ADQA - Autonomous Data Quality Agent

Data Ingress

ADQA reads data from local and remote files, SQL databases, and Airbyte connectors through its `data_ingress` module.

Supported Formats

CSV: Local or remote CSV files.
JSON: Local or remote JSON files.
Parquet: Local or remote Parquet files.
Excel: Local Excel workbooks.
SQL: Database connections (PostgreSQL, MySQL, SQLite, etc.) via SQLAlchemy.
PyAirbyte: Integration with Airbyte connectors for 300+ sources.

Using `DataSource`

The easiest way to initialize ADQA is via the `DataSource` factory or the convenience method `ADQA.from_path()`.

Example: CSV

from adqa import DataSource

# Build a source from a CSV file on disk.
source = DataSource.csv(path="data.csv")

Example: SQL

from adqa import DataSource

# Connection URI plus the query whose result becomes the source.
source = DataSource.sql(
    uri="postgresql://user:pass@localhost/db",
    query="SELECT * FROM my_table"
)
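
To give a feel for what a SQL source ultimately yields, the sketch below runs a similar query with plain pandas and an in-memory SQLite database. It is not ADQA's internal code: the table contents are made up for illustration, and it uses the standard-library `sqlite3` driver directly instead of SQLAlchemy so that it runs without a database server.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real server (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.commit()

# The query result arrives as a DataFrame with one column per selected field.
df = pd.read_sql_query("SELECT * FROM my_table", conn)
conn.close()

print(df)
```

A narrower query (for example, selecting only the columns to be checked) keeps the resulting DataFrame smaller.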

Internal Workflow

Factory Identification: ADQA identifies the reader based on the file extension or URI prefix.
Reader Instantiation: A specialized reader (e.g., CSVReader) is created.
Lazy/Eager Loading: Data is loaded into a pandas DataFrame for processing.
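
The identification step above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not ADQA's actual implementation: the function name `identify_reader`, the reader keys, and the two lookup tables are all hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical lookup tables; ADQA's real registry may differ.
EXTENSION_READERS = {
    ".csv": "csv",
    ".json": "json",
    ".parquet": "parquet",
    ".xlsx": "excel",
    ".xls": "excel",
}
SQL_SCHEMES = {"postgresql", "mysql", "sqlite"}


def identify_reader(location: str) -> str:
    """Return a reader key for a file path or a URI (hypothetical helper)."""
    parsed = urlparse(location)
    # SQLAlchemy-style URIs may carry a driver suffix, e.g. "postgresql+psycopg2".
    scheme = parsed.scheme.split("+")[0].lower()
    if scheme in SQL_SCHEMES:
        return "sql"
    # For local paths and remote file URLs, fall back to the file extension.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    try:
        return EXTENSION_READERS[suffix]
    except KeyError:
        raise ValueError(f"No reader registered for {location!r}") from None


for location in ("data.csv", "https://example.com/files/data.parquet",
                 "postgresql://user:pass@localhost/db"):
    print(location, "->", identify_reader(location))
```

Checking the URI scheme before the extension means a database URI is never mistaken for a file, even when the database name happens to end in something like `.json`.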