Project information
- Category: Data Science
- Employer: São Paulo State Treasury Department / University of São Paulo (USP)
- Project date: Jan 2024 - Oct 2024
- Challenge: Develop an automated system to extract and process invoice data (fiscal notes) and compare it with taxpayer-declared information for the Municipal Participation Index (DIPAM). DIPAM is an important Brazilian fiscal indicator that determines how tax revenue is distributed among municipalities based on reported economic activity. It is comparable to the methodologies used to allocate tax revenue to local councils.
Automation of Fiscal Data Extraction and Analysis
Approach:
The project implements an ETL (Extract, Transform, Load) process to automate the extraction of invoice data. The system processes multiple types of fiscal documents, aggregating information by document type, date, and municipality. Key automation steps include:
- Extracting structured data from invoices issued across multiple sources.
- Performing data validation and comparison against taxpayer declarations.
- Aggregating revenue indicators at the municipal level for DIPAM analysis.
- Implementing advanced data processing techniques to detect inconsistencies.
In the initial phase, the extracted data is used to compare declared versus actual invoice-reported revenues. The second phase aims to replace the taxpayer’s declaration altogether, eliminating manual submission. Taxpayers will only need to review and confirm the automatically generated figures.
Results:
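To make the first-phase comparison described under Approach concrete, here is a minimal sketch in Python. It is an illustration only, not the project's code or its results: the record fields, document type labels, municipality names, amounts, tolerance, and both function names are assumptions made up for the example.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical invoice records; field names and values are illustrative only.
invoices = [
    {"doc_type": "NF-e", "municipality": "Campinas", "value": Decimal("1200.00")},
    {"doc_type": "NFC-e", "municipality": "Campinas", "value": Decimal("300.00")},
    {"doc_type": "NF-e", "municipality": "Santos", "value": Decimal("750.00")},
    {"doc_type": "NF-e", "municipality": "Sorocaba", "value": Decimal("100.00")},
]

# Hypothetical taxpayer-declared totals per municipality.
declared = {
    "Campinas": Decimal("1500.00"),
    "Santos": Decimal("800.00"),
}


def aggregate_by_municipality(records):
    """Sum invoice values per municipality (the 'transform' step)."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in records:
        totals[record["municipality"]] += record["value"]
    return dict(totals)


def find_discrepancies(invoiced, declared_totals, tolerance=Decimal("0.01")):
    """Return declared minus invoiced for each municipality where they differ.

    A municipality missing from either side is treated as a total of 0.
    """
    discrepancies = {}
    for municipality in sorted(set(invoiced) | set(declared_totals)):
        difference = (
            declared_totals.get(municipality, Decimal("0"))
            - invoiced.get(municipality, Decimal("0"))
        )
        if abs(difference) > tolerance:
            discrepancies[municipality] = difference
    return discrepancies


if __name__ == "__main__":
    totals = aggregate_by_municipality(invoices)
    for municipality, difference in find_discrepancies(totals, declared).items():
        print(f"{municipality}: declared minus invoiced = {difference}")
```

Amounts are held as `Decimal` rather than `float` so that monetary sums compare exactly. A positive difference means the taxpayer declared more than the invoices show, and a negative one means invoices exist that the declaration does not cover.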
About São Paulo State Treasury Department
The São Paulo State Department of Treasury (SEFAZ-SP) oversees US$ 55 billion in annual taxpayer revenue, making it the top ICMS (comparable to the GST) revenue collector among Brazilian states. With around 8,000 employees, SEFAZ-SP combines modern public management practices with advanced technologies to provide high-quality services both in person and online.
Its operations span the entire state, with 18 regional tax units, numerous tax posts, and service centers in all 645 municipalities.