AnimeTimm

AnimeTimm is a DeepGHS project for training, testing, and sharing timm-based vision models for anime-style and illustration-focused image tagging.

It is part research playground, part anime-fan workshop: we care about reproducible datasets, model cards, ONNX exports, and practical demos, but the models are also built for people who actually work with 2D art, tags, characters, and visual search.

Project Stewardship

AnimeTimm is produced and maintained by the DeepGHS team and contributors.

This Hugging Face organization is the focused publishing home for AnimeTimm releases: model checkpoints, selected training datasets, and interactive Spaces. Upstream engineering work is hosted under the DeepGHS GitHub organization, including the deepghs/animetimm repository.

What We Build

Featured Dataset

danbooru-wdtagger-v4-w640-ws-full

The main public dataset release used by the dbv4-full model family. It is a Danbooru-derived WebDataset build for large-scale anime-style multi-label tagging, with images resized so that the shorter side is at most 640 pixels (min(width, height) <= 640).
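As a rough illustration of the size constraint, the sketch below computes a target size whose shorter side does not exceed 640 pixels. The function name `fit_short_side` is hypothetical, and two behaviors are assumptions rather than documented facts about the dataset build: the aspect ratio is preserved, and images that are already small enough are left untouched.

```python
def fit_short_side(width: int, height: int, limit: int = 640) -> tuple[int, int]:
    """Return a (width, height) pair with min(width, height) <= limit."""
    short = min(width, height)
    if short <= limit:
        # Assumption: images already within the bound are not upscaled.
        return width, height
    # Assumption: both sides are scaled by the same factor (aspect ratio kept).
    scale = limit / short
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_short_side(1280, 1920))  # portrait image, shorter side is the width
print(fit_short_side(500, 800))    # already within the bound
```

The actual preprocessing pipeline may differ in rounding or resampling details; check the dataset card before relying on exact dimensions.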

| Split | Images | Total Size |
| --- | --- | --- |
| train | 5,321,713 | 318 GB |
| test | 295,926 | 17.7 GB |
| val | 296,957 | 17.8 GB |
| total | 5,914,596 | 353.5 GB |

Each sample contains the image as webp plus JSON metadata: id, width, height, rating, general_tags, and character_tags. The selected label space has 12,476 tags: 9,225 general tags, 3,247 character tags, and 4 rating tags.
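To make the label layout concrete, here is a minimal sketch of turning one sample's metadata into a multi-hot target vector. It assumes that `general_tags` and `character_tags` are lists of strings and that `rating` is a single string; the tiny vocabulary, the example tag names, and the helper names `build_vocab` and `encode_sample` are all hypothetical. The real label space and its ordering come from the dataset card, not from this snippet.

```python
import json

# Hypothetical miniature label space; the real one has 12,476 tags.
GENERAL = ["1girl", "smile"]
CHARACTER = ["hatsune_miku"]
RATING = ["general", "sensitive"]  # assumed rating strings, not confirmed


def build_vocab(general, character, rating):
    """Assign one column per (category, tag) pair, in a fixed order."""
    pairs = (
        [("general", t) for t in general]
        + [("character", t) for t in character]
        + [("rating", t) for t in rating]
    )
    return {pair: index for index, pair in enumerate(pairs)}


def encode_sample(meta, vocab):
    """Turn one sample's metadata into a 0/1 vector over the label space."""
    pairs = (
        [("general", t) for t in meta["general_tags"]]
        + [("character", t) for t in meta["character_tags"]]
        + [("rating", meta["rating"])]
    )
    vector = [0] * len(vocab)
    for pair in pairs:
        index = vocab.get(pair)
        if index is not None:  # tags outside the selected label space are skipped
            vector[index] = 1
    return vector


raw = (
    '{"id": 1, "width": 640, "height": 960, "rating": "general", '
    '"general_tags": ["1girl", "smile", "rare_tag"], '
    '"character_tags": ["hatsune_miku"]}'
)
vocab = build_vocab(GENERAL, CHARACTER, RATING)
print(encode_sample(json.loads(raw), vocab))
```

Keying the vocabulary by (category, tag) avoids collisions if the same string ever appears as both a general tag and a character tag.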

Model Zoo: dbv4-full

The tables below focus only on the main dbv4-full model line. Metrics are copied from the corresponding model cards and use the test split reported there.
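For readers unfamiliar with the column names, the following sketch shows the standard definitions of macro and micro F1 at a fixed probability threshold of 0.40. It is an illustration only: the model cards may compute their numbers differently, the function name `f1_at_threshold` is hypothetical, and the Macro@Best column is not covered because this page does not define how the best threshold is chosen.

```python
def f1_at_threshold(y_true, y_prob, threshold=0.40):
    """Return (macro_f1, micro_f1) for multi-label predictions.

    y_true: rows of 0/1 ground-truth labels, one column per tag.
    y_prob: rows of predicted probabilities with the same shape.
    """
    n_tags = len(y_true[0])
    tp = [0] * n_tags
    fp = [0] * n_tags
    fn = [0] * n_tags
    for truth_row, prob_row in zip(y_true, y_prob):
        for j, (truth, prob) in enumerate(zip(truth_row, prob_row)):
            predicted = prob >= threshold
            if predicted and truth:
                tp[j] += 1
            elif predicted:
                fp[j] += 1
            elif truth:
                fn[j] += 1

    def f1(t, p, n):
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-tag F1 scores, so rare tags count as much as common ones.
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_tags)) / n_tags
    # Micro: pool the counts over all tags first, so frequent tags dominate.
    micro = f1(sum(tp), sum(fp), sum(fn))
    return macro, micro


y_true = [[1, 0, 1], [1, 1, 0], [0, 1, 0], [1, 0, 0]]
y_prob = [[0.9, 0.1, 0.3], [0.8, 0.5, 0.2], [0.6, 0.7, 0.1], [0.2, 0.1, 0.1]]
macro, micro = f1_at_threshold(y_true, y_prob)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

Because macro F1 weights every tag equally, it is typically lower than micro F1 on long-tailed label spaces, which matches the pattern in the tables.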

[Figure: dbv4-full model performance and parameter snapshot]

Top 5 By Macro F1

| Rank | Model | Family | Params | Macro@Best F1 | Macro@0.40 F1 | Micro@0.40 F1 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | convnextv2_huge.dbv4-full | ConvNeXt | 692.6M | 0.611 | 0.580 | 0.697 |
| 2 | eva02_large_patch14_448.dbv4-full | EVA | 316.8M | 0.599 | 0.569 | 0.693 |
| 3 | caformer_b36.dbv4-full | CAFormer | 134.0M | 0.581 | 0.546 | 0.689 |
| 4 | swinv2_base_window8_256.dbv4-full | SwinV2 | 99.7M | 0.575 | 0.541 | 0.683 |
| 5 | caformer_m36.dbv4-full | CAFormer | 82.7M | 0.559 | 0.515 | 0.676 |

Representative Models By Backbone Family

Each row is the best dbv4-full model currently published for that backbone family.

| Family | Model | Params | Macro@Best F1 | Macro@0.40 F1 | Micro@0.40 F1 |
| --- | --- | --- | --- | --- | --- |
| ConvNeXt | convnextv2_huge.dbv4-full | 692.6M | 0.611 | 0.580 | 0.697 |
| EVA | eva02_large_patch14_448.dbv4-full | 316.8M | 0.599 | 0.569 | 0.693 |
| CAFormer | caformer_b36.dbv4-full | 134.0M | 0.581 | 0.546 | 0.689 |
| SwinV2 | swinv2_base_window8_256.dbv4-full | 99.7M | 0.575 | 0.541 | 0.683 |
| ViT | vit_base_patch16_224.dbv4-full | 95.8M | 0.540 | 0.500 | 0.664 |
| MobileNetV4 | mobilenetv4_conv_aa_large.dbv4-full | 47.3M | 0.511 | 0.458 | 0.641 |
| MobileNetV3 | mobilenetv3_large_150d.dbv4-full | 29.3M | 0.462 | 0.400 | 0.605 |
| MobileViT | mobilevitv2_200.dbv4-full | 30.2M | 0.454 | 0.401 | 0.608 |
| ResNet | resnet152.dbv4-full | 83.7M | 0.486 | 0.448 | 0.624 |
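If you are choosing a backbone under a size constraint, the per-family rows can be filtered by parameter count. The sketch below copies the Params and Macro@Best F1 values from the table above. The helper `best_under` is hypothetical, and ranking by Macro@Best F1 alone is an assumption about what matters for your use case; latency, input resolution, and memory are not considered.

```python
# (family, model, params in millions, Macro@Best F1), copied from the table above.
MODELS = [
    ("ConvNeXt", "convnextv2_huge.dbv4-full", 692.6, 0.611),
    ("EVA", "eva02_large_patch14_448.dbv4-full", 316.8, 0.599),
    ("CAFormer", "caformer_b36.dbv4-full", 134.0, 0.581),
    ("SwinV2", "swinv2_base_window8_256.dbv4-full", 99.7, 0.575),
    ("ViT", "vit_base_patch16_224.dbv4-full", 95.8, 0.540),
    ("MobileNetV4", "mobilenetv4_conv_aa_large.dbv4-full", 47.3, 0.511),
    ("MobileNetV3", "mobilenetv3_large_150d.dbv4-full", 29.3, 0.462),
    ("MobileViT", "mobilevitv2_200.dbv4-full", 30.2, 0.454),
    ("ResNet", "resnet152.dbv4-full", 83.7, 0.486),
]


def best_under(budget_millions):
    """Return the row with the highest Macro@Best F1 within the budget, or None."""
    candidates = [row for row in MODELS if row[2] <= budget_millions]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[3])


for budget in (50, 100, 1000):
    print(budget, best_under(budget))
```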

Try It

Maintenance

The source data and chart builder are stored in this Space repository so the organization card can be regenerated without guessing:

Acknowledgements

Notes

These releases are research and hobbyist infrastructure for visual tagging. Please check each model or dataset card for license, source data notes, intended use, and audience restrictions before reuse.