TIME SENSITIVE
Project Description: I have approximately 200 reports in Excel and PDF format. These reports contain tables or structured/semi-structured data, but the formatting, field names, and file naming conventions vary significantly across files. I'm looking for a skilled data analyst or Python developer who can help me compare these reports and identify which ones are at least 60% similar in content. This will require fuzzy matching techniques and possibly data normalization.
Responsibilities:
Extract data from PDF and Excel reports (some may require OCR or table parsing).
Clean and normalize the data across all files.
Compare the reports and determine which are ≥60% similar based on data content.
Deliver a summary of matched report pairs or groups with similarity scores....
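The comparison step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the text of each report is assumed to be already extracted into a string, the similarity measure is Jaccard overlap of normalized tokens (one possible choice; the listing only asks for "fuzzy matching"), and the names `normalize`, `jaccard`, and `similar_pairs` are hypothetical.

```python
import re
from itertools import combinations


def normalize(text: str) -> set[str]:
    """Lowercase the text and keep only alphanumeric tokens (assumed normalization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_pairs(reports: dict[str, str], threshold: float = 0.6):
    """Return (name_a, name_b, score) for every pair scoring at or above threshold."""
    tokens = {name: normalize(body) for name, body in reports.items()}
    pairs = []
    for x, y in combinations(sorted(tokens), 2):
        score = jaccard(tokens[x], tokens[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 3)))
    # Highest-scoring pairs first
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical sample data standing in for extracted report text
sample = {
    "a.xlsx": "Name, Total Sales, Region",
    "b.pdf": "name total sales region quarter",
    "c.pdf": "inventory count warehouse",
}
print(similar_pairs(sample))
```

With about 200 reports this means roughly 20,000 pairwise comparisons, which is small enough for a plain loop. A token-set measure ignores field order and formatting, which suits files whose layouts differ, but it would not catch renamed fields or misspelled values without an additional fuzzy string-matching step.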
Keyword: Data Processing
Delivery Time: 2 days left
Price: $481.00
Data Mining, Data Processing, Excel, Python, Software Architecture
Project description: We are looking for a developer with AppSheet experience to build an application that lets the adjuster appointed by the insurer carry out technical inspections of homes affected by rain or earthquakes, in order to...