https://nyaa.si/view/1442127
This is the most recent database dump for trace.moe which contains image hashes of 31000 hours of anime. (No video files included)
As mentioned previously, the hashing library has been upgraded. So this is a complete re-hash of all anime, and these hashes are incompatible with sola. They are only compatible with the latest liresolr. Compared to the last DB dump, this data set has recent anime added, and has replaced many subbed versions with raw anime.
These can be loaded into a local liresolr database for anime scene search. You can follow the project on GitHub if you are interested.
https://github.com/soruly/trace.moe
Some anime not found previously can now be found after these updates and a re-hash of all anime. See details below:
Upgraded from java-1.8.0-openjdk (java 8) to java-latest-openjdk (java 17) for worker nodes
Upgraded from solr 7.5.0 to 8.9.0
Upgraded liresolr with image cache issues fixed and updated LIRE.
https://github.com/soruly/liresolr
Upgraded LIRE from 1.0_b05.jar to 1.0_b06.jar. This is a breaking change, so it requires a re-hash of all videos.
https://github.com/soruly/LIRE
The 83203 videos have just been re-hashed with the latest version of liresolr. This should fix some broken videos and timecodes.
An archive of the hashes (~23GB) will be published in October.
Accuracy has been tuned up from 2% to 3%, which makes searches take 300-500ms longer on average.
You can search images directly from the Windows context menu by installing this PowerShell script. See instructions below.
https://nyaa.si/view/1393270
This is the most recent database dump for trace.moe which contains image hashes of ~5000 anime titles. (No video files included)
Compared to the last DB dump, this data set has recent anime added, and has replaced many subbed versions with raw anime.
These can be loaded into a sola database for searching locally. You can follow the project on GitHub if you are interested.
https://github.com/soruly/sola
https://soruly.github.io/trace.moe-api/
The new trace.moe API is finalized. There are quite a lot of changes, so please read the above API docs. They also include a migration guide for developers to update their programs. The old API is planned to shut down on 30th June 2021, depending on how fast developers migrate their programs.
Along with the API changes, the sponsor tiers have also changed. Starting from next month, existing patrons will receive an email with their trace.moe account according to their sponsor tier. If you can't wait to try it out, you can also message me to get early access.
You can also choose to use GitHub Sponsors if you prefer. The sponsor tiers are exactly the same. But you'll have to email me your GitHub ID and email address to claim the rewards for sponsors.
Traffic Graph
You can now see the server's traffic on https://trace.moe/about
Account Page
You can check your search quota and limits on https://trace.moe/account
Making Anilist info optional
Crawling Anilist data actually violates the Anilist API's terms of service. Instead of crawling everything upfront, I think it's better to query Anilist data on-the-fly when necessary. The website and Telegram bot have been updated to behave like this. This should reduce future API changes caused by changes in Anilist's data structure. The Chinese translated titles are also separated from my database now, so it can run as a standalone proxy that injects Chinese titles on-the-fly. If you're interested in serverless/Cloudflare Workers, take a look at this repo. https://github.com/soruly/anilist-chinese
Rate Limits
In recent months I've been working hard to regulate the traffic to avoid crashing the server due to overload. Apart from the rate limit and concurrent search limit, I've also added a queue for database searches. Sadly, the queue still often gets full during rush hours, especially when there are a few slow search requests that block the rest of the queue. I'm still evaluating the parameters, so the actual numbers are not finalized yet. I'm expecting to complete and release the new API in 1-2 months. If this concerns you, join the discussion on Discord https://discord.gg/K9jn6Kj
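To make the "queue gets full" behaviour concrete, here is a minimal sketch of a bounded waiting queue that turns requests away once it is full. The class name, method names and the capacity are hypothetical and only for illustration; the source does not describe how the real queue is implemented or what its parameters are.

```python
from collections import deque


class BoundedSearchQueue:
    """Hypothetical bounded queue: rejects new searches once max_waiting is reached."""

    def __init__(self, max_waiting: int):
        self.max_waiting = max_waiting  # assumed capacity, not a real trace.moe value
        self._waiting = deque()

    def submit(self, request_id: str) -> bool:
        """Return True if the request was queued, False if the queue is full."""
        if len(self._waiting) >= self.max_waiting:
            return False
        self._waiting.append(request_id)
        return True

    def next_request(self):
        """Pop the oldest waiting request, or None when nothing is waiting."""
        return self._waiting.popleft() if self._waiting else None


queue = BoundedSearchQueue(max_waiting=2)
accepted = [queue.submit(name) for name in ("a", "b", "c")]  # third one is rejected
```

Because requests leave in arrival order, one slow search at the head delays everything behind it, which matches the rush-hour problem described above.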
I've rewritten the webpage.
Now it shows animated GIF previews for all results instead of static thumbnails. Generating that many animated previews is CPU-intensive, so the number of results is capped to the top 5 only.
On the top left, you can see the image that is used for searching. Previously this was drawn behind the canvas, which was a bit confusing.
Now it only shows important information from Anilist. Details like staff/characters are removed; please visit Anilist for details if you're interested.
Site navigation has moved from the top to the bottom of the page, including all important URLs to related sites.
This update also dropped support for old browsers like Internet Explorer. If you're having issues with the new website, please report them to me.
The source code of the website is now located at https://github.com/soruly/trace.moe-www
- Change: image preview endpoint has changed, the url is now consistent with video preview
- New: size param for image/video preview
Please refer to https://soruly.github.io/trace.moe/#/#previews
This is the most recent database dump for trace.moe which contains image hashes of ~5000 anime titles. (No video files included)
Compared to the last DB dump, this data set has recent anime added, and has replaced many subbed versions with raw anime.
These can be loaded into a sola database for searching locally. You can follow the project on GitHub if you are interested.
https://github.com/soruly/sola
A new search algorithm has been added to trace.moe. You can try this using the "Use new algo" check box near the search button. This new algorithm has better support for flipped and cropped images. See results above.
This new algorithm still needs fine-tuning of its parameters, so I'm now opening this option for everyone to try. For developers using the API, you can use /search?method=jc. In upcoming months, I'll review the performance and accuracy of these two algorithms and see if JCD is good enough to replace ColorLayout. If not, I'll look into methods to combine the two.
The new algorithm, JCD (Joint Composite Descriptor), is composed of CEDD (Color and Edge Directivity Descriptor) and FCTH (Fuzzy Color and Texture Histogram), and takes color, edge (shape) and texture into account in image analysis. trace.moe has been using the ColorLayout descriptor, which does not analyze image edges and shapes but only the distribution of colors in an 8x8 grid. This new algorithm makes it possible to search for flipped/cropped images, which previously failed.
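To see why a descriptor based only on where colors sit in a grid struggles with flipped images, here is a deliberately simplified sketch. It is not the ColorLayout implementation in LIRE: it only averages grayscale values per grid cell, and the function names and the synthetic test image are made up for illustration.

```python
def grid_signature(pixels, grid=8):
    """Average brightness of each cell in a grid x grid layout, listed row by row.

    pixels is a 2D list of grayscale values and must be at least grid x grid in size.
    """
    height, width = len(pixels), len(pixels[0])
    signature = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            signature.append(sum(cell) / len(cell))
    return signature


def distance(a, b):
    """Sum of absolute differences between two signatures (0 means identical)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Synthetic 16x16 image whose brightness increases from left to right.
image = [[x * 16 for x in range(16)] for _ in range(16)]
flipped = [list(reversed(row)) for row in image]

same = distance(grid_signature(image), grid_signature(image))
mirrored = distance(grid_signature(image), grid_signature(flipped))
```

Here `same` is 0 while `mirrored` is large, even though both images show the same content. A position-based grid signature treats a mirrored frame as a different picture, which is the weakness the paragraph above attributes to ColorLayout.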
Note that none of these image descriptors were invented by me. If you're interested in the principles of the algorithm(s), please read the paper from the original author and its open source implementation, LIRE.
When indexed on the same video data set, the size of the indexed database (extracted) is 232GB, 85% larger than that of ColorLayout (125GB). The database cache of the two algorithms and the memory for Apache Solr now occupy 420GB of RAM (out of 512GB) on the server.
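The figures above can be checked with a few lines of arithmetic. The variable names are only for illustration; the numbers are the ones quoted in the text.

```python
jcd_gb = 232           # extracted JCD index size
color_layout_gb = 125  # extracted ColorLayout index size
ram_used_gb = 420      # RAM used by both caches plus Solr
ram_total_gb = 512     # RAM installed on the server

# Relative growth of the index, and the share of server RAM in use.
size_increase = (jcd_gb - color_layout_gb) / color_layout_gb
ram_share = ram_used_gb / ram_total_gb

print(f"JCD index is {size_increase:.1%} larger")
print(f"RAM in use: {ram_share:.1%}")
```

This gives an increase of about 85.6%, in line with the quoted 85%, and shows that roughly 82% of the server's RAM is already committed.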
Serving the database for the new algorithm requires powerful servers. Please continue to support this project for ACG fans! :3
This is the most recent database dump for trace.moe which contains image hashes of ~100,000 anime files. (No video files included)
Compared to the last DB dump, this data set has recent anime added and versions from different subgroups de-duplicated. For most anime, only one version of the same episode is kept in the db. This results in a smaller database size and faster search time.
These can be loaded into a sola database for searching locally. I'm still modifying sola and documenting the steps to do so. You can follow the project on GitHub if you are interested.
https://github.com/soruly/sola
If your OS is in dark mode, the website switches to dark mode automatically without any settings.
For Windows, the setting is in Settings => Personalization => Colors
For macOS, the setting is in System Preferences => General => Appearance
You can now easily use the trace.moe API directly with an image URL
https://soruly.github.io/trace.moe/
This is the most recent database dump for trace.moe which contains image hashes of ~100,000 anime files. (No video files included)
These can be loaded into a sola database for searching locally. I'm still modifying sola and documenting the steps to do so. You can follow the project on GitHub if you are interested.
https://github.com/soruly/sola
Recently I've been improving webpage loading times in various ways:
- Rewrote the webpage JS to remove heavy libraries like jQuery / bootstrap.js
- Used HTTP/2 Push to reduce round-trip times
- Upgraded Cloudflare CDN to pro plan ($20/month)
Now the webpage can complete loading in just 90ms!! (from regions close to origin server)
trace.moe is now scoring 98 and 100 in Google PageSpeed Insights.
I've also improved website security to score A+ (115/100) in Mozilla Observatory test.
Recently I've found that Cloudflare's free plan did not always route to the nearest edge server due to its network capacity. When traffic is busy in a region, it may route to a server thousands of miles away, which increases page load time by hundreds of milliseconds. That's why I've upgraded to the Cloudflare pro plan to ensure the website is consistently fast all over the world. I hope your support can cover the increased cost.
New web UI layout for mobile devices!
A few days ago, I redesigned the webpage layout a bit to better support mobile devices. (finally!) Though the layout isn't perfect, at least it's easier to use and browse compared to the fixed viewport before. Try the webpage now at https://trace.moe
Natural Scene Cutter for video preview
I started a new project in late October 2018 to cut video scene previews naturally, by detecting the timestamp boundaries of a scene. This method is better than cutting at a fixed time offset, because the resulting video can loop over the scene. I'm still writing the description of this project, which you can find on GitHub https://github.com/soruly/trace.moe-media
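As a rough illustration of "detecting timestamp boundaries of a scene", the sketch below grows a clip outwards from the matched frame until it hits a large frame-to-frame change. This is an assumption about the general idea, not the method used in trace.moe-media; the function name, the difference scores and the threshold are all hypothetical.

```python
def scene_bounds(frame_diffs, hit_index, threshold):
    """Find the scene around a matched frame.

    frame_diffs[i] is a difference score between frame i and frame i + 1,
    so there are len(frame_diffs) + 1 frames. A score at or above threshold
    is treated as a scene cut. Returns (start, end) frame indices, inclusive.
    """
    start = hit_index
    while start > 0 and frame_diffs[start - 1] < threshold:
        start -= 1
    end = hit_index
    while end < len(frame_diffs) and frame_diffs[end] < threshold:
        end += 1
    return start, end


# Seven frames with cuts between frames 1|2 and 4|5.
diffs = [0.1, 0.9, 0.1, 0.2, 0.8, 0.1]
bounds = scene_bounds(diffs, hit_index=3, threshold=0.5)  # (2, 4)
```

A fixed-offset cut around frame 3 could easily include frames from the neighbouring scenes, while the boundaries found here keep the preview inside a single scene so it loops cleanly.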
GIF Preview for Telegram bot
To try the natural scene cutter, you can use the Telegram bot.
https://telegram.me/WhatAnimeBot
Send an image with caption "mute", and it will return a muted video preview (same as GIF), which can loop a scene.
API updates
The natural scene cutter can also be used via API now.
The API and API docs have been updated to resolve some confusion around search image format and error handling. The API no longer has strict restrictions on image format and now supports both JSON and FORM POST. Details are written in https://soruly.github.io/trace.moe/
Dockerizing sola
sola is now mostly dockerized. It is a lot easier for developers to set up their own video scene search engine now. Detailed instructions are written in the project:
https://github.com/soruly/sola
The domain incident hit this project quite hard last October. But with the help of fans and developers around the world, the number of daily search queries has now been restored to the same level as half a year ago. Thank you for all your support!! ;)
Yesterday, the .ga domain provided by Freenom (domain name registrar) was suddenly suspended. I had no choice but to move immediately to trace.moe, which I had already been planning to move to.
As of today, officially supported integrations (WebExtensions, Telegram bot) and some major integrations (Discord, SauceNAO) have been updated to the new domain. In upcoming days I'll work with developers of the remaining 3rd party integrations (including API users) to notify them of the change.
Why the name trace.moe?
This search engine tells you more than the anime name: it actually traces back to the moment of the scene. And quite a number of users actually need "time tracing" even when they already know the anime. That's why I think using "trace" instead of "whatanime" better describes this search engine.
The last piece of the puzzle in whatanime.ga has been published.
The whatanime repository on GitHub has gained a lot of developers' attention. However, it only includes the code for the website and is not a fully functional system. Some developers were puzzled when they tried to look into the code.
Last month, I rewrote those messy scripts into a new project - sola.
This includes all the scripts that whatanime.ga currently uses to put mp4 video files into solr for search. It has been running for a month, so this should be stable enough to publish. sola is not limited to searching anime; any type of video (like movies, TV shows) can also be indexed. It does not depend on whatanime and anilist; the only dependency is liresolr. So users who don't need anime info and web UI can avoid those complexities.
sola is a node.js app. Many developers should find the code easy to read and modify. Hopefully this will encourage contributions to the project. It has just ~700 lines of code in total. For developers who don't like JavaScript (or would like to avoid GPLv3), it's not difficult to rewrite the whole thing.
I've written a setup guide for developers to easily set up their own video scene search engine. Spread the news to developers and let them try!
https://github.com/soruly/sola
https://youtu.be/HjL5O3k3C7s
New testing site on https://beta.whatanime.ga
I've been testing a new beta site with improved performance on search speed.
A copy of the 130GB solr core from http://whatanime.ga is split into 10 smaller cores (10-16GB each, with 59-89 million hashes). Each search queries all 10 solr cores in parallel and then merges the results. Early tests show that it can query 950 million image hashes in 1.35 seconds! That is almost 10 times as many as the existing site.
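The split-and-merge approach can be sketched as a scatter-gather search: send the same query to every core in parallel, then keep the best hits overall. In this sketch the "cores" are plain in-memory stand-ins with a toy distance function, not real Solr cores, and all names are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def make_core(hashes):
    """Stand-in for one Solr core: 'distance' is just the difference of two ints."""
    def search(query):
        return [(abs(value - query), label) for label, value in hashes]
    return search


def search_all_cores(cores, query, top_k=5):
    """Query every core in parallel, then merge and keep the top_k closest hits."""
    with ThreadPoolExecutor(max_workers=len(cores)) as pool:
        partial_results = list(pool.map(lambda core: core(query), cores))
    all_hits = (hit for hits in partial_results for hit in hits)
    return heapq.nsmallest(top_k, all_hits)


cores = [
    make_core([("ep1@00:10", 12), ("ep1@00:20", 40)]),
    make_core([("ep2@05:00", 9), ("ep2@05:30", 77)]),
]
results = search_all_cores(cores, query=10, top_k=2)
# results == [(1, "ep2@05:00"), (2, "ep1@00:10")]
```

Each core only scans its own slice of the hashes, so the total time is roughly that of the slowest core plus a small merge step, rather than the time to scan the whole index in one go.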
I've rewritten a lot of backend scripts into a nice CLI tool. I'm going to publish it on GitHub later this month, after I've completed the docs and tutorial. For experienced developers, it will be a lot easier to set up their own video reverse search engine.
This will automatically detect and crop black borders on the search image, significantly increasing the accuracy on bad screenshots
This is applied to the web, the Telegram bot and all API clients
This is achieved with OpenCV using a simple Python script
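The cropping idea can be shown without any image library. The sketch below is not the OpenCV script mentioned above: it works on a plain 2D list of grayscale values, and the function name and the darkness threshold are assumptions for illustration.

```python
def crop_black_borders(pixels, threshold=16):
    """Trim rows and columns at the edges that contain only near-black pixels.

    pixels is a 2D list of grayscale values (0-255). A row or column counts
    as border when none of its pixels is brighter than threshold (assumed value).
    """
    rows = [y for y, row in enumerate(pixels) if max(row) > threshold]
    cols = [x for x in range(len(pixels[0]))
            if max(row[x] for row in pixels) > threshold]
    if not rows or not cols:
        return pixels  # entirely dark image: nothing sensible to crop
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


# A letterboxed 6x4 frame: two black rows above and below two bright rows.
frame = [[0] * 4, [0] * 4, [200] * 4, [200] * 4, [0] * 4, [0] * 4]
cropped = crop_black_borders(frame)  # keeps only the two bright rows
```

Removing the bars matters because a descriptor computed over the whole screenshot would otherwise spend part of its grid on black pixels that the indexed video frames do not have.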
You can subscribe to the Telegram Channel for database and news updates.
You can now send GIF or video to the Telegram Bot.
You can now add the Telegram Bot to a Telegram group. Use @ to mention the bot on any photo to search.
Identified and fixed an issue where browsers failed to search due to blocked scripts.
Database dump updated to 2017-04.
Take a look at a demo on demo.whatanime.ga
This is the new icon for whatanime.ga
Want to know more about whatanime.ga? Read the presentation slides from June 2017.
You can now search scenes with any aspect ratio. Thumbnail previews also respect the aspect ratio now. Recaptcha is removed; you must wait up to 10 minutes once you have reached the search quota limit (20 searches per 10 minutes). Homepage code has been rewritten, so the webpage now loads faster. A new loading animation was also added.
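A quota of 20 searches per 10 minutes can be sketched as a sliding-window counter. The source does not say whether the real limit uses a sliding or a fixed window, so the window type, the class name and the method name here are assumptions. The sketch tracks a single user.

```python
from collections import deque


class SearchQuota:
    """Hypothetical sliding-window limiter: at most `limit` searches per window."""

    def __init__(self, limit=20, window_seconds=600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted searches, oldest first

    def allow(self, now: float) -> bool:
        """Return True and record the search if the quota is not exhausted."""
        # Forget searches that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True


quota = SearchQuota()
first_twenty = [quota.allow(t) for t in range(20)]  # one search per second
blocked = quota.allow(21)      # quota used up, so this is refused
allowed_again = quota.allow(600)  # the search made at t=0 has expired
```

With this design the longest possible wait is one full window, which is consistent with "wait up to 10 minutes" in the text.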
You can now keep searching the database for more results. Previously, the search would stop when it has found any result > 90% similarity. Now keep searching to discover more results with even higher similarity!
You can now select a particular year / season to search. If you like this project, feel free to Support whatanime.ga on Patreon.
Server upgrade and cleanup were completed on 15 Apr 2017. An additional hard drive and a new network adapter have been installed.
You can now see the system status at https://status.whatanime.ga (Powered by UptimeRobot).
The anime info panel had not been showing since Feb 21 21:13 UTC. The service was restored on Feb 23 03:34 UTC.
The image proxy server has been moved from Singapore (Digital Ocean) to Tokyo (Linode). It may affect loading times of images from ?url= params.
The Telegram Bot will now send you a video preview.
The official API is now open for testing. Interested developers may read the page on GitHub Pages.
The image extraction method of the WebExtension has changed. This should fix some issues with grabbing the correct image on a webpage to search. The Extension now also supports Microsoft Edge. You may try loading the zip from GitHub by enabling extension developer features in about:flags.
Now mobile devices will mute and autoplay the video preview.
You can now use URL params to control playback options, for example:
https://trace.moe/?autoplay=0&loop&mute=1&url=
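For developers generating such links, the example URL can be taken apart with Python's standard library to see which parameters it carries. This only shows how the query string parses; how the website interprets each value is not described here.

```python
from urllib.parse import parse_qs, urlsplit

example = "https://trace.moe/?autoplay=0&loop&mute=1&url="

# keep_blank_values=True keeps flags like "loop" and the empty "url=" placeholder,
# which parse_qs would otherwise drop.
params = parse_qs(urlsplit(example).query, keep_blank_values=True)
print(params)
```

Note that `loop` has no value and `url` is left empty in the example, so the image URL to search would be appended after `url=`.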
Reduced search result candidates from 10 Million to 3 Million. This reduces accuracy but greatly improves performance.
You can now download a complete dump of the database.
You can see server load and recently indexed files in /about
The database has started indexing raw anime from now on.
This telegram bot can tell you where an anime screenshot is taken from. Just send / forward an image to https://telegram.me/WhatAnimeBot .
The ?auto url param is no longer used. Now it would always automatically search.
The WebExtension has been updated. It now copies and pastes the image using a dataURL in the background. This allows searching images from pages that are not publicly available, such as Facebook Feeds. It also supports searching from HTML5 videos using the frame extracted at the moment you click it. Extensions no longer use ?url to send search images.
Chrome Extension, Firefox Add-on and Opera Add-on have been released.
Source code available on GitHub
It now shows anime titles according to users' browser language.
It now shows a loading icon (instead of blurring) while searching. Also display a loading icon when the video preview is loading.
The thumbnail may not be at the exact moment, since the seeking is not very accurate. Play the preview to see if it's what you are looking for.
In case the image cannot be loaded, upload the image from file or copy image itself (not URL) then Ctrl+V
Images loaded from a URL are now compressed. This speeds up loading GIFs and large images from URLs.
The server now caches some search results for 5-30 minutes. The better the search results, the longer they are cached.
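One way to express "the better the results, the longer the cache" is to map similarity onto the 5-30 minute range. The linear mapping, the function name and the 0-1 similarity scale below are assumptions; the source only gives the 5 and 30 minute bounds.

```python
def cache_ttl_seconds(similarity, min_minutes=5, max_minutes=30):
    """Map a similarity score in [0, 1] to a cache lifetime in seconds.

    Assumed linear rule: similarity 0 gives min_minutes, similarity 1 gives max_minutes.
    """
    similarity = min(max(similarity, 0.0), 1.0)  # clamp out-of-range scores
    minutes = min_minutes + (max_minutes - min_minutes) * similarity
    return int(minutes * 60)


ttl_poor = cache_ttl_seconds(0.0)     # 300 seconds (5 minutes)
ttl_half = cache_ttl_seconds(0.5)     # 1050 seconds (17.5 minutes)
ttl_perfect = cache_ttl_seconds(1.0)  # 1800 seconds (30 minutes)
```

The reasoning behind such a rule is that a near-perfect match is unlikely to change when the same image is searched again, while a poor match may improve as new anime gets indexed.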
Once the image completes loading, it is searched automatically. You can change the setting in the Chrome Extension. Also see how you can search in Firefox in the FAQ.
You can now see how much data is cached in RAM on the About page. The higher the percentage, the faster the search. It usually stays around 33% due to limited RAM.
There have been some incorrect timestamps in search results because the image analysis scripts parsed the output of ffmpeg incorrectly. The script has now been updated. From now on, new anime should have correct timestamps, while it will take about two months to fix already indexed anime. The new script is also 33% faster when indexing anime.
The image would sometimes go black when clicking the fit / flip button. Now fixed.
You can now choose to Fit Width / Height for your search image. Also fixed some flickering issue on previews.
Fixed some cross-site image URL linking issue. Most image URL should load now.
A better layout for more Anime information.
You can now search by Image URL
The Safe Search Option can hide most Hentai Anime from search results. But you should be aware that some regular season Anime can still be obscene. (NSFW)
Now you can use the Chrome Extension to search.
About 168 Hentai Anime series have been added.
To help users to understand how the search engine works, we have added some good and bad screenshots in FAQ.
Improved caching method to warmup cold data.
The database has been cleaned up and reloaded. This should fix most video previews. Fixed some anime titles still being null.
Fixed the empty search result issue. A large number of files have been relocated, so video previews may be missing for some search results.
Increased cache size to improve performance when the server has been idle for a long time.
You may now flip the image before searching. If you can't find a match, try to flip your image and search again. (Especially useful for AMV)
Switched to use a new searching algorithm. The search is slower but more accurate.
Added some informative pages.