HomeInterview QuestionsJSON format is available in the raw folder. The ta…

JSON format is available in the raw folder. The task is to read a particular file from that location and perform an operation to identify duplicates.

🟡 Medium Coding Mid level
1Times asked
Jul 2026Last seen
Jul 2026First seen

💡 Model Answer

To solve this problem in Python, you can use the built‑in json module or a library like pandas for convenience. First, load the file:

python
import json
with open('raw/data.json', 'r') as f:
    data = json.load(f)

Assuming data is a list of dictionaries, you can identify duplicates by converting each record to a hashable tuple of its values or by using a set of a unique key (e.g., a combination of fields that should be unique). For example, if duplicates are defined by the same order_id and timestamp, you can do:

python
seen = set()
duplicates = []
for record in data:
    key = (record['order_id'], record['timestamp'])
    if key in seen:
        duplicates.append(record)
    else:
        seen.add(key)

If you prefer pandas, it’s even simpler:

python
import pandas as pd

df = pd.read_json('raw/data.json')
dup_df = df[df.duplicated(subset=['order_id', 'timestamp'], keep=False)]

Complexity: The set‑based approach runs in O(n) time and O(n) space, where n is the number of records. The pandas method internally uses hash tables and also achieves linear time. This solution scales well for large files and can be extended to more complex duplicate criteria by adjusting the key or subset used.

This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.

🎤 Get questions like this answered in real-time

Assisting AI listens to your interview, captures questions live, and gives you instant AI-powered answers — invisible to screen sharing.

Get Assisting AI — Starts at ₹500