Best way to find data duplicates?

My application receives a new import daily. The import contains all records from another system, so my system receives both new and old rules.



I've got 2 identical models:

1 Importrule
2 Importrule_log



Whenever I process a rule I haven't seen before, I copy it to Importrule_log. The next time I receive it in my import, I can check Importrule_log to see whether I have processed this rule before and set it to done without it going through the whole process.

The Importrule model has over 30 properties which I want to compare. My go-to solution was to create an object and use its 30 properties to search Importrule_log. This doesn't return any object even though the data is the same (I've logged both sets of values). Some properties are empty/null; could this be a problem when comparing?
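
On the empty/null question: if Importrule_log is stored in a SQL database (an assumption, since the platform isn't named here), a plain equality comparison never matches a NULL column, and that alone can make a search on 30 properties come back empty. A minimal sketch using Python's built-in sqlite3, with a hypothetical two-column table standing in for the real model:

```python
import sqlite3

# Hypothetical stand-in for Importrule_log with only two properties.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE importrule_log (name TEXT, category TEXT)")
conn.execute("INSERT INTO importrule_log VALUES (?, ?)", ("rule-a", None))

# Incoming rule with the same data, including the null property.
incoming = ("rule-a", None)

# '=' evaluates to NULL (not true) when either side is NULL, so no row matches.
rows_eq = conn.execute(
    "SELECT COUNT(*) FROM importrule_log WHERE name = ? AND category = ?",
    incoming,
).fetchone()[0]

# SQLite's IS operator behaves like '=' but treats NULL IS NULL as true.
rows_is = conn.execute(
    "SELECT COUNT(*) FROM importrule_log WHERE name IS ? AND category IS ?",
    incoming,
).fetchone()[0]

print(rows_eq, rows_is)  # 0 1
```

Other databases and ORMs spell the null-safe comparison differently, so this only illustrates the effect, not the exact syntax for any particular platform.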

My other idea was to generate one giant string per rule containing all the properties, and use that string to search Importrule_log.
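
A rough sketch of that string idea, hashed down to a fixed-length fingerprint so the lookup is a single comparison. The names `rule_fingerprint`, `is_duplicate` and the sample fields are hypothetical, and treating an empty string the same as null is an assumption that may not suit every property:

```python
import hashlib
import json

def rule_fingerprint(rule: dict) -> str:
    """Return a stable SHA-256 hex digest over all properties of a rule."""
    # Assumption: empty string and null mean the same thing for this data.
    normalized = {key: (None if value == "" else value) for key, value in rule.items()}
    # Sorted keys and explicit separators give one canonical string per rule;
    # JSON quoting avoids the ambiguity of plain concatenation ("ab"+"c" vs "a"+"bc").
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# In-memory stand-in for an indexed fingerprint column on Importrule_log.
seen_fingerprints = set()

def is_duplicate(rule: dict) -> bool:
    """Return True if an identical rule was seen before; otherwise record it."""
    fingerprint = rule_fingerprint(rule)
    if fingerprint in seen_fingerprints:
        return True
    seen_fingerprints.add(fingerprint)
    return False

first = {"name": "rule-a", "category": None, "amount": 10}
second = {"amount": 10, "category": "", "name": "rule-a"}

print(is_duplicate(first))   # False: first time this rule is seen
print(is_duplicate(second))  # True: same data, different key order and "" instead of null
```

Storing the fingerprint as one extra indexed property on the log would turn the 30-property search into a single equality check, at the cost of recomputing it whenever the set of compared properties changes.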


Which solution is better? Is there another solution? (Deduplicating the import itself is not an option, for other reasons.)






This topic is closed.