data:image/s3,"s3://crabby-images/82a14/82a149ece899e4aac9485f4fe8aaa10847104f37" alt=""
Problem Description
There are a set of records which are not in sequence order and varies every month based on the input criteria. Also, along with Random data records, some records number would not be present in the list. The worker wants multiple files with unique records Randomly with a static interval value. The worker combines all sample files together and expecting un-duplicate records.
This is a very general problem in the reporting system wherever a sample report generated out of millions of records in random order.
The executives struggled to verify the whole set of records so they need few records in random order to validate the data.
Solution
The above problem is very general and having various solution to resolve it. The KW Team determine their own solution to solve this problem.
There are some Prerequisites which are given below –
- There must be a fixed interval value and that should not change over the year.
- The starting point should be different for the entire month like for January 01, February 02 etc. or it can be any unique number for every month during the entire year.
- There should be a sufficient record to generate the report per month.
Once above Prerequisites are decided, the user can generate sample report using the below steps –
- Retrieve the data from the database by using order by DBMS_RANDOM.VALUE function. This function is applicable to Oracle database. For example:
SELECT DISTINCT RECORDID FROM SAMPLE WHERE paymentDate=’01-01-2019′ ORDER BY DBMS_RANDOM.VALUE.
The above select SQL query will retrieve the same set of records in a different position on each execution. - A rankID can be assigned to entire records based on its retrieval not based on recordID sequence number.
- Insert the records into a table with the recordID and its corresponding rankID.
- Now start the picking the records from the above data set based on the rankID.
Suppose we are retrieving a report for the month of January then Starting point (seed) will be 1 and it traverses the entire data set with the interval value of 100. - The above steps help to generate a sample Report with random data for the first month.
- The user has to reuse the data from the table which store rankID corresponding to each record in first-month report generation. Retrieve the universe of data for consecutive months and insert non ranked records at the end of the table with incremental rankID. Generate the unique seeds value to pick the records to generate the report for the consecutive month.
- The above steps has to repeat for the entire month in order to generate the reports with unique random records. Please find the Flow diagram in below section to get detailed information.
Subsequent Month Sample Report Generation Process
The flow diagram depicts the process to generate/retrieve unique random records from the data universe. Please find it below and let us know your feedback.