PyGraft is an open-source Python library for generating synthetic yet realistic schemas and knowledge graphs (KGs) based on user-specified parameters, built in collaboration with Université de Lorraine. PyGraft has the following features:
- can generate a schema, a KG, or both
- highly tunable process driven by a broad array of user-specified parameters
- schemas and KGs are built with an extended set of RDFS and OWL constructs
- logical consistency is ensured by the use of a Description Logic reasoner (HermiT)
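To make the idea of generating under a schema and then checking consistency concrete, here is a minimal pure-Python sketch. It is not PyGraft's API or algorithm: the toy schema, the function names `generate_kg` and `find_violations`, and the class and relation names are all hypothetical, and the check covers only domain/range typing against disjoint classes, whereas PyGraft relies on the HermiT reasoner for full consistency checking.

```python
import random

# Hypothetical toy schema; class and relation names are illustrative only.
SCHEMA = {
    "classes": ["Person", "City", "Company"],
    "disjoint": [("Person", "City"), ("Person", "Company"), ("City", "Company")],
    "relations": {
        "livesIn": {"domain": "Person", "range": "City"},
        "worksFor": {"domain": "Person", "range": "Company"},
        "locatedIn": {"domain": "Company", "range": "City"},
    },
}


def generate_kg(schema, num_entities, num_triples, seed=0):
    """Give each entity one class, then sample triples that respect domain and range."""
    classes = schema["classes"]
    if num_entities < len(classes):
        raise ValueError("need at least one entity per class")
    rng = random.Random(seed)
    # Cycle through the classes so that every class has at least one instance.
    types = {f"e{i}": classes[i % len(classes)] for i in range(num_entities)}
    by_class = {c: [e for e, t in types.items() if t == c] for c in classes}
    # Upper bound on distinct well-typed triples, to avoid looping forever.
    capacity = sum(
        len(by_class[spec["domain"]]) * len(by_class[spec["range"]])
        for spec in schema["relations"].values()
    )
    if num_triples > capacity:
        raise ValueError(f"at most {capacity} distinct triples are possible")
    relations = sorted(schema["relations"])
    triples = set()
    while len(triples) < num_triples:
        rel = rng.choice(relations)
        spec = schema["relations"][rel]
        head = rng.choice(by_class[spec["domain"]])
        tail = rng.choice(by_class[spec["range"]])
        triples.add((head, rel, tail))
    return types, triples


def find_violations(schema, types, triples):
    """Return triples whose head or tail class is disjoint with the relation's domain or range."""
    disjoint = {frozenset(pair) for pair in schema["disjoint"]}
    violations = []
    for head, rel, tail in triples:
        spec = schema["relations"][rel]
        bad_head = frozenset((types[head], spec["domain"])) in disjoint
        bad_tail = frozenset((types[tail], spec["range"])) in disjoint
        if bad_head or bad_tail:
            violations.append((head, rel, tail))
    return violations


types, triples = generate_kg(SCHEMA, num_entities=9, num_triples=10)
print(len(triples), "triples,", len(find_violations(SCHEMA, types, triples)), "violations")
```

Because triples are sampled only from well-typed candidates, the generated graph passes this check by construction. A hand-written triple such as `("e1", "livesIn", "e0")`, where `e1` is a `City`, would be reported as a violation.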
We expect PyGraft to help you generate new, tailored benchmark datasets for a range of fields and studies, including neuro-symbolic AI, link prediction, node classification, node clustering, ontology repair, pattern mining, reasoning, and scalability studies, as well as fields whose data is sensitive or not readily available.
More info:
Paper: https://arxiv.org/abs/2309.03685
Code: https://github.com/nicolas-hbt/pygraft
PyPI: https://pypi.org/project/pygraft/
All contributions and ideas to improve PyGraft are welcome.
Feel free to download, star, fork, share and tell us about any usage you foresee!