Web app questions

Post by **Cakedaddy** » Mon Jul 22, 2024 1:58 pm

As you know, I've been building a web app that does stuff for us.

I have a page that displays ticket data. The data is pulled from many different tables. So each time I want to view a different table, there's about 8 queries that have to run to pull all the data from each table. There is noticeable lag as each query is run, and the data populates the fields.

First question: Should I have one query with 8 inner joins pulling everything at once, or 8 different queries like I have now. I feel like the inner joins are better, but also didn't know if that was bad practice, as in, having too many joins.

Second question: If I just wanted to run more basic queries, that pulled every record from every table, then each field hits those query results to display data. I would only have to run each query once, then all the data is available regardless of which 'ticket' I wanted to look at, making changing tickets much faster. (cause you didn't know that. . .) Question is, where is that data stored? Is it local to the browser that's running the app, or on the AWS server that is hosting the app?
I know this has to be considered bad practice (pulling all records so they are in memory), but my record count is low enough that I can get away with it. Then as I learn more about this stuff, I can make changes later to do it better. If it's stored with the browser running the app, that's no good because the clients are on phones. But if it's local to the AWS server, then no big deal.

Post by **TheCatt** » Mon Jul 22, 2024 2:24 pm

The 2nd question depends on your technology stack.

If all the data is for a given ticket, then using a single query that represents its data is good. If you have lookup tables that are pretty much always the same (possible status fields, technicians, etc) those should be cached in server memory and not requeried.

Post by **Cakedaddy** » Tue Jul 23, 2024 10:07 pm

I rewrote the query with a bunch of INNER JOINS. While the initial load of the screen takes a few seconds longer, interacting with the data is much faster, and more stable. For some reason, the DB will sometimes refuse to answer a query. This leaves you with empty fields. Over all, a big improvement.

Next question:
When creating a ticket, I have to create records in multiple tables. Create tech assignment records, invoice records, attached document records, etc. Right now, I have multiple queries inserting into each table. Some are actually dependent on previous queries finishing. For instance, the first thing I do is insert the new ticket. This generates the ticketID. I then grab that ticketID and use it in all the other records as the common field. Sometimes, again because the DB I use is not 100% reliable, as is the app tool I'm using, it seems like the other queries will run before the ticket is created. This results in them using the previous ticket's ID. Obviously a problem. Plus there's the time when the DB will just refuse a query, and not create the other records that are needed. So I'm looking at transactions, which you mentioned in another thread. But I get the impression that a transaction has to be contained in a single query. Can a single transaction span multiple queries? If not, is it bad practice to have one query that inserts into multiple tables. Everything in one, I can do the BEGIN/COMMIT per the tutorials I've seen. None of them talk about doing it across multiple queries.

Post by **Cakedaddy** » Wed Jul 24, 2024 2:13 am

As is often the case, typing out the issue here causes rethinking stuff and thinking of different ways to search for help.

Ticket creation is now one query. I do NOT have it explicitly set as a transaction. I'm told that it is automatically considered one in postgresql. That being said. . . one of the queries will find a recently added document. As in, when I add a document, it creates the record with 0 as the ticketID, because the ticket hasn't been created yet, but the doc has been uploaded. When I create the ticket, part of the create query will update all document records with ticketID=0 to ticketID=[new ticket's ID]. If there were no documents updated, then the query returns 0 records updated. Would this trigger a transaction failure? Since no action was taken? I'm assuming no, since the query didn't fail, it just didn't find a match. Just want to make sure this isn't evidence of this query NOT being run as a transaction.

I was even able to figure out how to create multiple records from an array. Previously, I had a function that was iterate through the array, and call the query for each element. Now the array is unnesting in the VALUES of the query. No matter how I asked how to do this, the function was the way I kept finding. I stumbled on unnest while looking at other stuff.

I'm getting really good at this stuff.

Post by **Cakedaddy** » Tue Aug 26, 2025 1:49 pm

Having a hard time coming up with an easy way to handle timesheets.

Currently, a tech will log time attached to a ticket. They select a ticket, log in and out as needed while working on that ticket. This allows me to track time spent on any given ticket. Working great.

Then for payroll, I grab that time and calculate total hours, OT, etc.

We pay OT for any hours over 8 in one day, and/or over 40 for the week.

So I take the total hours for the day, subtract 8, left overs are OT, etc.

This all works well when any given shift is on the same day. In other words, if I start at 8pm and work till 11:30, break for lunch and start back up at 12:30am and work till 5, the first half of the shift is on day 1, second half on day 2. So for the week, the first half of each shift is paired with the previous night's second half of the shift. This messes up the OT calculations.

I'm currently about to set it up where I have to go to each time entry and assign it to the day I want the time to apply to. But I feel like there's some better logic/formula that can be used to calculate times than doing that manually.

Post by **TheCatt** » Tue Aug 26, 2025 3:18 pm

So you need to track "shifts" not "days", is what it sounds like to me?

Post by **Cakedaddy** » Tue Aug 26, 2025 10:08 pm

Ok. . . . but how? I guess I assign that when I create the ticket? Except it's possible for one ticket to have multiple days of work associated with it, some being day, some being over night shifts.

Post by **Cakedaddy** » Fri Oct 17, 2025 4:15 am

Checking ChatGPT's query skills. It's now writing queries for me that are WAAAYYYY beyond my abilities. Once they are written, I can follow the logic, however, it would have taken me hours to research the commands, syntax, etc to write them myself. That being said, I have no idea if they are 'good' queries. So, here's the latest: How well written is this Postgresql query?

Code: Select all

WITH ordered AS (
    SELECT
        th.*,
        TO_TIMESTAMP(timestart, 'MM/DD/YYYY HH12:MI AM') AS ts_start,
        TO_TIMESTAMP(timeend,   'MM/DD/YYYY HH12:MI AM') AS ts_end,
        milesend - milesstart AS mileage_diff
    FROM tech_times th
    WHERE
        techid = {{techSelect.selectedOptionValue}}
        -- ✅ include one day before and after to capture cross-midnight shifts
        AND TO_TIMESTAMP(timestart, 'MM/DD/YYYY HH12:MI AM') >= TO_TIMESTAMP({{appsmith.store.STARTDATE}}, 'MM/DD/YYYY HH12:MI AM') - INTERVAL '1 day'
        AND TO_TIMESTAMP(timestart, 'MM/DD/YYYY HH12:MI AM') < TO_TIMESTAMP({{appsmith.store.ENDDATE}}, 'MM/DD/YYYY HH12:MI AM') + INTERVAL '1 day'
),
gaps AS (
    SELECT
        *,
        CASE
            WHEN ts_start - LAG(ts_end) OVER (PARTITION BY techid ORDER BY ts_start) > ({{Input1.text}} || ' hours')::interval
            THEN 1 ELSE 0
        END AS is_new_shift
    FROM ordered
),
shifts AS (
    SELECT
        *,
        SUM(is_new_shift) OVER (PARTITION BY techid ORDER BY ts_start ROWS UNBOUNDED PRECEDING) AS shift_id
    FROM gaps
),
shift_starts AS (
    SELECT
        techid,
        shift_id,
        MIN(ts_start) AS shift_start
    FROM shifts
    GROUP BY techid, shift_id
),
TimeDifferences AS (
    SELECT
        s.techid,
        s.shift_id,
        s.timecode,
        s.ts_start,
        s.ts_end,
        s.milesstart,
        s.milesend,
        s.timediff,
        s.mileage_diff,
        ROW_NUMBER() OVER (PARTITION BY s.techid, s.shift_id ORDER BY s.ts_start) AS rn,
        COUNT(*) OVER (PARTITION BY s.techid, s.shift_id) AS cnt,
        SUM(CASE WHEN s.timecode = 'T' THEN s.timediff ELSE 0 END) OVER (PARTITION BY s.techid, s.shift_id) AS traveltime,
        SUM(CASE WHEN s.timecode = 'O' THEN s.timediff ELSE 0 END) OVER (PARTITION BY s.techid, s.shift_id) AS onsitetime,
        SUM(CASE WHEN s.timecode = 'A' THEN s.timediff ELSE 0 END) OVER (PARTITION BY s.techid, s.shift_id) AS admintime,
        SUM(CASE WHEN s.timecode = 'B' THEN s.timediff ELSE 0 END) OVER (PARTITION BY s.techid, s.shift_id) AS breaktime,
        SUM(s.timediff) OVER (PARTITION BY s.techid, s.shift_id) AS alltimes
    FROM shifts s
)
SELECT
    td.shift_id,
    ROUND(SUM(
        CASE
            WHEN td.timecode = 'T' AND (td.rn = 1 OR td.rn = td.cnt) THEN GREATEST(CAST(td.timediff AS NUMERIC) - 0.5, 0)
            WHEN td.timecode = 'T' THEN CAST(td.timediff AS NUMERIC)
            ELSE 0
        END
    ), 2) AS commute,
    ROUND(SUM(
        CASE
            WHEN td.timecode = 'T' AND (td.rn = 1 OR td.rn = td.cnt) THEN GREATEST(CAST(td.mileage_diff AS NUMERIC) - 15, 0)
            WHEN td.timecode = 'T' THEN CAST(td.mileage_diff AS NUMERIC)
            ELSE 0
        END
    ), 2) AS commute_miles,
    ROUND(MAX(td.traveltime)::numeric, 2) AS travel,
    ROUND(MAX(td.onsitetime)::numeric, 2) AS onsite,
    ROUND(MAX(td.admintime)::numeric, 2) AS admin,
    ROUND(MAX(td.breaktime)::numeric, 2) AS break,
    ROUND(MAX(td.alltimes)::numeric, 2) AS totaltime,
    SUM(td.mileage_diff) AS miles,
    TO_CHAR(MIN(td.ts_start) AT TIME ZONE 'UTC', 'HH12:MI MM/DD/YYYY') AS shift_start,
    TO_CHAR(MAX(td.ts_end)   AT TIME ZONE 'UTC', 'HH12:MI MM/DD/YYYY') AS shift_end
FROM
    TimeDifferences td
JOIN
    shift_starts ss
    ON td.shift_id = ss.shift_id
    AND td.techid = ss.techid
WHERE
    -- ✅ only count shifts whose *start* falls within the pay period
    ss.shift_start >= TO_TIMESTAMP({{appsmith.store.STARTDATE}}, 'MM/DD/YYYY HH12:MI AM')
    AND ss.shift_start < TO_TIMESTAMP({{appsmith.store.ENDDATE}}, 'MM/DD/YYYY HH12:MI AM')
GROUP BY
    td.techid, td.shift_id
ORDER BY
    MIN(td.ts_start);

The query is for this table that calculates travel, overtime, etc.

table.jpg

Post by **Cakedaddy** » Fri Oct 17, 2025 4:21 am

ChatGPT has been immensely helpful in writing these queries. However, it's not perfect. I would ask it for something, and it would give me one. I would run it, but it wasn't quite right. I would explain the issues and it would correct them. However, there was one issue that it couldn't get without my help. It kept applying hours that were worked on Sunday morning on the following week's pay check, even though they were also included on the previous weeks paycheck. I kept telling it it wasn't working right, and it kept acting like it understood what I was saying and then saying "This should work!". I had to tell it that it wasn't checking the previous Saturday's hours to see if they were part of a shift that started on Saturday. It congratulated me on my great catch and corrected the query. So, you have to have SOME skill to use it, but again, incredibly helpful in building 99% of the query. And VERY polite.

Do you guys use it when programming?

OH! And it loves making suggestions for tweaks, etc. I have learned to ignore most of them because they eat up your free interactions and I run out before I'm done working! It also gave partial answers to some of my issues and then would ask if I wanted to full query so I could just copy/paste, eating a transaction. So it knows to 'waste conversation' so I run out and am tempted to pay to use. That being said, I have learned to save up my questions and ask detailed multi part questions to get multiple answers with one interaction. Two can play at that game. . .

Post by **TheCatt** » Fri Oct 17, 2025 7:43 am

It's certainly not a bad query. Looking at it, it uses relatively advanced stuff like CTEs and window functions. Everyone I interview, I quiz on window functions to see if they really understand SQL more than the basic CRUD stuff. I'm also not a postgres person.

Does it intentionally use UTC to compute shift_start/shift_end for a reason, is the only thing that immediately jumps out at me? Instead of EDT/EST?

Cakedaddy wrote: ↑Fri Oct 17, 2025 4:21 am ChatGPT has been immensely helpful in writing these queries. However, it's not perfect. I would ask it for something, and it would give me one. I would run it, but it wasn't quite right.

Yeah, that's basically how it works. It's guessing, and you're the validator.

Half our code-writing people use it to some degree. We use GitHub CoPilot, which integrates directly into Visual Studio Code. The downside, is exactly what you said. You have to become a babysitter.

Post by **Cakedaddy** » Fri Oct 17, 2025 5:35 pm

TheCatt wrote: ↑Fri Oct 17, 2025 7:43 am Does it intentionally use UTC to compute shift_start/shift_end for a reason

I don't know. It didn't tell me. . . but I'll ask! I assumed it's because it keeps the time zone generic. I don't like that it does that because although I don't display or use that field, it still throws me off when I'm troubleshooting and looking at the data behind the data.

When asked, I get this response.

Code: Select all

Excellent question — and you’re absolutely right to ask.  ** see how polite and positive GPT is!  **

The only reason I used UTC was because in many systems (especially web apps or servers) timestamps are stored in UTC to avoid ambiguity between time zones and daylight saving transitions.
Using AT TIME ZONE 'UTC' ensures we’re interpreting the stored value consistently, even if the database default time zone isn’t set.

And I guess that makes sense. If people in other time zones start using my app, will the prevalent use of EST throughout many of my queries mess things up?

But also, I'm sure there are a TON of mistakes in how I created my tables, fields, etc. The fact that I store dates as a string seems wrong. The fact that I do a TON of math in my queries, seems wrong. But that is due to limitations I face using the low code system I'm using. It allows java script, but it can be very clunky, and even impossible to do some calculations/functions/processes within the app's objects themselves. Like in the 'on click' field of an object. There are certain java script functions that it just doesn't support. Tables prevent you from using data that is in the table in certain ways. For instance, if I have a column in the table that subtracts the start time from the end time to get a time interval, I can't have another column reference that time interval column for other calculations, such as, subtract 8 to determine overtime. Stuff like that. So I had to move a bunch of the processing into the query.