I have a table in which a column has to be incremented. If the record does not exist, the record needs to be created with value 1.
I have 2 approaches in mind for this -
In my use case, the record-not-found case would be rare - it could happen maybe once in 1 or 2 months. So I feel it is wise to use the 2nd approach, for the following reasons -
Please share your thoughts.
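For context, here is a sketch of what I take the two approaches to be (the question doesn't spell them out, so this is my assumption based on the "double roundtrip" remark below): a check-then-write in application code versus a single atomic upsert statement. SQLite is used here only as a stand-in; the upsert syntax differs per engine (MySQL uses `ON DUPLICATE KEY UPDATE`, PostgreSQL and SQLite use `ON CONFLICT`). The table and column names are made up for illustration.

```python
import sqlite3

# Hypothetical schema: a counter keyed by name (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")

def increment_check_then_write(name):
    """Approach 1 (assumed): read first, then UPDATE or INSERT - two statements."""
    row = conn.execute("SELECT value FROM counters WHERE name = ?", (name,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO counters (name, value) VALUES (?, 1)", (name,))
    else:
        conn.execute("UPDATE counters SET value = value + 1 WHERE name = ?", (name,))

def increment_upsert(name):
    """Approach 2 (assumed): a single atomic upsert (SQLite >= 3.24 syntax)."""
    conn.execute(
        "INSERT INTO counters (name, value) VALUES (?, 1) "
        "ON CONFLICT(name) DO UPDATE SET value = value + 1",
        (name,),
    )

increment_check_then_write("pageviews")  # row missing, so it is created with value 1
increment_upsert("pageviews")            # row exists, so it is incremented to 2
print(conn.execute("SELECT value FROM counters WHERE name = 'pageviews'").fetchone()[0])  # → 2
```

Note that approach 1 is not safe under concurrency without extra care: two clients can both see "no row" and both try to insert, so you still want the unique constraint that the upsert relies on anyway.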
Measuring is always important. Personally, I use generated data for data-driven testing. Not to mention the differences between dynamic field types, which allocate disk space dynamically, and fixed-length / fixed-size fields.
And of course there is the index lookup cost: in theory a B-tree index should be O(log(n)) while a hash index is O(1), but it's really tricky to generalize across databases without looking at the engines / protocols etc.
Then there's the memory consumption of multiple connections, and locking between concurrent updates / inserts.
Anyhow, I would be curious about some analytic tests and behaviour analysis across the different databases and engines - but at the mechanics level, not at the 'random benchmark' level. In the end: understanding the causation, not just noticing a correlation.
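In that spirit, here is a minimal sketch of the kind of generated-data measurement I mean, assuming the two approaches are check-then-write vs upsert. It runs against in-memory SQLite, so the absolute numbers say nothing about a networked client-server database - it only shows the measurement shape, not a verdict.

```python
import random
import sqlite3
import string
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")

# Generated test data: 1000 pre-existing keys; the workload only hits
# existing keys, matching the "record not found is rare" scenario.
keys = ["".join(random.choices(string.ascii_lowercase, k=8)) for _ in range(1000)]
conn.executemany("INSERT INTO counters VALUES (?, 1)", [(k,) for k in set(keys)])
keys = list(set(keys))
workload = random.choices(keys, k=10_000)

def check_then_write(k):
    if conn.execute("SELECT value FROM counters WHERE name = ?", (k,)).fetchone():
        conn.execute("UPDATE counters SET value = value + 1 WHERE name = ?", (k,))
    else:
        conn.execute("INSERT INTO counters VALUES (?, 1)", (k,))

def upsert(k):
    conn.execute(
        "INSERT INTO counters VALUES (?, 1) "
        "ON CONFLICT(name) DO UPDATE SET value = value + 1", (k,))

def bench(fn):
    start = time.perf_counter()
    for k in workload:
        fn(k)
    return time.perf_counter() - start

print(f"check-then-write: {bench(check_then_write):.3f}s")
print(f"upsert:           {bench(upsert):.3f}s")
```

To move from "random benchmark" to mechanics, you would vary one factor at a time: hit ratio, key length, index type where the engine lets you choose, and connection count.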
Mark
formerly known as M
In general, you should write what will be the cleanest code, and only optimize it after verifying through measurement that 1) there is a performance problem and 2) this is actually faster.
It seems like a reasonable first assumption that the database does pretty much what you would do by hand, but without the double roundtrip. I think the amount of data sent would be about equal when using prepared statements, too.
This is not to say that the above paragraph is the answer, just to show that conflicting theories can seem reasonable and that you always need to measure.