"DataMetaMap" is a Python library for representing multiple datasets in a shared vector space so that they can be compared with one another. The library offers a suite of advanced dataset embedding techniques compatible with PyTorch.
We need a way to measure how similar the information in different datasets is. With such a measure, we can find the dataset most similar to the dataset of our target task. Choosing a neural network pretrained on that dataset narrows down the pool of candidate pretrained models.
- Maximum Mean Discrepancy, also see 📝 review
- Task2Vec, also see 📝 paper
- Dataset2Vec, also see 📝 paper
- Wasserstein Task Embedding, also see 📝 paper
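
To give a feel for the first method in the list, here is a minimal, dependency-free sketch of a Maximum Mean Discrepancy estimate between two samples. It is not the DataMetaMap API: the function names `gaussian_kernel`, `mmd_squared` and `sample`, the choice of a Gaussian kernel, and the fixed `bandwidth` are all assumptions made for illustration. In practice the same computation would typically run on PyTorch tensors.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(a, b):
        # Average kernel value over all pairs (p, q) with p in a and q in b.
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample(mean, n, dim=2, seed=0):
    """Hypothetical toy dataset: n points drawn from N(mean, 1) per coordinate."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


# Toy "datasets": one target and two candidates at different distances from it.
target = sample(0.0, 50, seed=0)
near = sample(0.2, 50, seed=1)
far = sample(3.0, 50, seed=2)

print("MMD^2(target, near):", mmd_squared(target, near))
print("MMD^2(target, far): ", mmd_squared(target, far))
```

A smaller value indicates that the two samples look more alike under the chosen kernel, so in this toy setup the `near` candidate would be preferred over `far` as a source for pretraining.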
TODO
TODO
TODO
TODO
TODO
- Vladislav Minashkin (Project planning, Benchmarking, Algorithms)
- Papay Ivan (Documentation writing, Code writing, Algorithms)
- Meshkov Vlad (Blog post, Demo, Algorithms)
- Stepanov Ilya (Tech. report, Code writing, Algorithms)
- You are welcome to contribute to our project!
There is nothing here yet.