Tokern Catalog
Overview
dbcat scans your databases and data warehouses and maintains their metadata in a catalog. It also stores metadata generated by other data governance applications such as PIICatcher and Lineage Engine. dbcat is typically used alongside those applications, but it can also be used stand-alone to generate a simple data catalog through the CLI or API.
dbcat stores the catalog in a PostgreSQL or SQLite database. By default, the catalog is stored in a SQLite database at ~/.config/tokern/catalog.db.
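Because the default catalog is an ordinary SQLite file, it can be inspected with Python's standard library. The sketch below is illustrative only: the helper name list_catalog_tables is hypothetical and not part of dbcat, and it makes no assumption about which tables dbcat creates. It opens the file read-only and lists whatever tables it finds.

```python
import sqlite3
from pathlib import Path

# Default location stated in the overview above.
DEFAULT_CATALOG = Path("~/.config/tokern/catalog.db")


def list_catalog_tables(path=DEFAULT_CATALOG):
    """Return the table names in a SQLite catalog file (hypothetical helper)."""
    db_path = Path(path).expanduser().resolve()
    if not db_path.exists():
        # No catalog has been created at this location yet.
        return []
    # mode=ro opens the file read-only, so the catalog cannot be modified
    # or accidentally created by this script.
    conn = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    for table in list_catalog_tables():
        print(table)
```

If the catalog is stored in PostgreSQL instead, this sketch does not apply; use a PostgreSQL client to inspect it.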
The catalog can be exported to DataHub or Amundsen. This is especially useful for exporting PII tags or column lineage generated by PIICatcher or Lineage Engine. Check the documentation for detailed instructions on setting PII tags and column-level lineage.
We need your help!
We are currently gathering feedback on our Tokern project, and your input would help us greatly in shaping our roadmap for the future. If you are interested in submitting your input for the development of dbcat, take our survey here!
Quick Start
dbcat is distributed as a Python application. Install it in a virtual environment, add a data source, and scan it:
python3 -m venv .env
source .env/bin/activate
pip install dbcat
dbcat catalog add-sqlite --name sample --path <path to sqlite db>
dbcat catalog scan --source-name sample
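If you do not have a SQLite database at hand to try the commands above, a small one can be created with Python's standard library. This is a minimal sketch: the file name sample.db, the helper create_sample_db, and the two tables are made up for illustration and are not part of dbcat.

```python
import sqlite3


def create_sample_db(path="sample.db"):
    """Create a small SQLite database with two illustrative tables."""
    conn = sqlite3.connect(path)
    try:
        # Hypothetical schema; any tables would do for a first scan.
        conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS customers (
                id INTEGER PRIMARY KEY,
                full_name TEXT NOT NULL,
                email TEXT
            );
            CREATE TABLE IF NOT EXISTS orders (
                id INTEGER PRIMARY KEY,
                customer_id INTEGER REFERENCES customers(id),
                total_cents INTEGER NOT NULL
            );
            """
        )
        conn.commit()
    finally:
        conn.close()
    return path


if __name__ == "__main__":
    print(create_sample_db())
```

The resulting file path can then be passed to the --path option of dbcat catalog add-sqlite shown above.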
Supported Technologies
The following databases are supported:
- MySQL/MariaDB
- PostgreSQL
- AWS Redshift
- BigQuery
- Snowflake
- AWS Athena