General questions

What does Elog.io do?

Elog.io sits alongside your browser and gives you an opportunity to explore the photos that you encounter while browsing. For any photograph in the catalog, it can tell you who the author is, where the photograph is from, and what permissions you may have to use it.

What photos do you store information about?

We currently have information about photographs from Wikimedia Commons, a collection of approximately 23 million photographs (November 2014). The photographs on Wikimedia Commons in turn come from a wide variety of sources: some are uploaded by the authors themselves, some are imported from Flickr, and some come from other sources. Elog.io is constantly expanding the number of photographs in its database, and the aim is to include as many openly licensed photos as possible.

So only openly licensed photos then?

Yes. We strongly believe in the open ecosystem, and we believe that being able to answer the basic question "is this openly licensed?" for any photograph is important to sustaining it. For photographs that you can identify through Elog.io, you can be reasonably certain that they are indeed licensed in the way indicated (public domain, a Creative Commons license or similar).

Can I include my own photos?

We'd love to include your photos, or any collections that you're responsible for, so we can make them findable through Elog.io. Please contact us for more information. Please note that at this time, we're primarily looking at including larger collections of openly licensed photographs. The more work you're willing to do yourself on making them available to us, or on creating a project together with us to do the work, the more likely it is that we can include them quickly.

Why is this even important?

From a practical perspective, knowing who authored a photograph and what terms and conditions it's available under is important in order to understand how you may use a photograph, and what attribution you must provide when you do.

We also believe that having this information available places the photograph in its right cultural context. Knowing where it's from and who authored it contributes to our understanding of a photograph, increases its value, and makes us relate to it in a different way. We think that if people had this information available about any photograph they encountered online, there would be less need to enforce and assert copyright, and over time we could create an ecosystem where mutual respect and common sense take precedence.

What's your privacy policy?

We take every precaution to keep your information safe, and we don't purposefully collect any more information than we need to do the actual processing on your behalf. We also remove information as soon as we're done with it. You can read more in our Privacy Policy.

Image matching

Why doesn't Elog.io match X?

The matching algorithm that powers Elog.io is based on Blockhash, a free and open source software implementation of a photograph matching algorithm. In our research, we've found that this algorithm gives the best matches of the algorithms we have tested, and it regularly outperforms many others. That doesn't mean it's perfect though!

Blockhash is less effective on landscape photographs, and in general on any photograph that combines large areas of similar color with strongly contrasting areas. For instance, a photograph with open sky in its upper half and buildings or other features in its lower half has an area of very similar color (the sky) and significant contrast (between its upper and lower halves). From the algorithm's point of view, such photographs are very similar to each other, and creating good matches is almost impossible.

The Blockhash algorithm also doesn't deal well with modifications to photographs. We've set the bar at verbatim re-use: the algorithm is designed to match photographs that have been resized or converted from one format to another (JPG to PNG, for example) but still keep the same aspect ratio and the same content. The moment you crop the photograph, add borders, change the colors, or do other manipulations, you're creating a derivative work, and it will most likely not be matched.
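
To make the difference between resizing and cropping concrete, here is a minimal sketch in Python. It is not the actual Blockhash code: block_hash is a simplified block-mean hash written only for illustration, and the synthetic 16x16 test image, the helper names and the 4x4 grid are all assumptions. It shows that a resized copy keeps the same hash, while a cropped copy does not.

```python
from statistics import median


def block_hash(pixels, blocks=4):
    """Simplified block-mean hash (illustrative, NOT the real Blockhash code).

    Splits a grayscale image into blocks x blocks cells and emits one bit
    per cell: 1 if the cell's mean is above the median cell mean, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    means = []
    for by in range(blocks):
        for bx in range(blocks):
            y0, y1 = by * height // blocks, (by + 1) * height // blocks
            x0, x1 = bx * width // blocks, (bx + 1) * width // blocks
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    threshold = median(means)
    return [1 if mean > threshold else 0 for mean in means]


def hamming(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


def make_image(size=16):
    """Hypothetical test image: a bright square centred on a dark background."""
    quarter = size // 4
    return [
        [200 if quarter <= x < size - quarter and quarter <= y < size - quarter else 50
         for x in range(size)]
        for y in range(size)
    ]


def resize_nearest(pixels, new_width, new_height):
    """Nearest-neighbour resize that keeps the whole picture."""
    height, width = len(pixels), len(pixels[0])
    return [
        [pixels[y * height // new_height][x * width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]


def crop(pixels, left, top):
    """Cut away `left` columns and `top` rows, changing the framing."""
    return [row[left:] for row in pixels[top:]]


original = make_image()
resized = resize_nearest(original, 32, 32)
cropped = crop(original, 4, 4)

# The resized copy produces an identical hash (distance 0).
print("resized:", hamming(block_hash(original), block_hash(resized)))
# The cropped copy shifts the content across the grid, so bits flip.
print("cropped:", hamming(block_hash(original), block_hash(cropped)))
```

Because the hash only looks at how brightness is distributed over a fixed grid, anything that moves the content relative to that grid (cropping, borders) changes the bits, while uniform scaling does not.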

Why don't you use algorithm Y?

In our work with Elog.io, we've evaluated a number of different algorithms. Chances are, Y is one of them, but we'd still love to hear your thoughts about it! If you have a compelling algorithm, we may consider switching to it. Here are the considerations and conditions we place on our algorithms:

  • They should not be encumbered by patents, and we should be able to implement them in Free and Open Source Software, like Elog.io.
  • They should work well in a browser, meaning that they should be reasonably fast and easy to implement in JavaScript.
  • They should generate a hash value, with the property that small changes to the photograph (resizing, etc.) result in small changes to the hash.
  • Over a random sample of photographs, they should generate no or very few false positives (photographs that are matched against each other despite being different) and no or few false negatives (photographs that don't match despite being the same).
  • In general, they should prioritise generating no false positives, since a false positive is worse than a false negative.

When we evaluate an algorithm, we generate hashes for a representative set of photographs that we have on file. We do two tests. In the first test we transform the photographs (resizing, changing format, cropping, adding borders, rotating) and see how much we can transform a photograph while still matching the original. In the second test we generate hashes for a large number of photographs and cross-compare them against each other. We calculate the Hamming distance (the number of bits that differ) between the hashes and find out how often we would get false positives.
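
The second test can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not our actual test harness: the function names, the made-up 64-bit hex hashes and the threshold of 10 bits are all hypothetical. Any pair of different photographs that ends up under the threshold would count as a false positive.

```python
from itertools import combinations


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def close_pairs(hashes, threshold):
    """Cross-compare every pair of hashes and keep those within `threshold` bits.

    `hashes` maps a photograph name to its hex hash. The result is a list of
    (name_a, name_b, distance) tuples, with names in sorted order.
    """
    pairs = []
    for name_a, name_b in combinations(sorted(hashes), 2):
        distance = hamming_distance(hashes[name_a], hashes[name_b])
        if distance <= threshold:
            pairs.append((name_a, name_b, distance))
    return pairs


# Made-up hashes for three different photographs (illustrative values only).
sample = {
    "sunset": "ffff00000000ffff",
    "harbour": "ffff00000000fffe",
    "portrait": "0f0f0f0f0f0f0f0f",
}

# Hypothetical matching threshold: hashes within 10 bits count as a match.
for name_a, name_b, distance in close_pairs(sample, threshold=10):
    print(f"possible false positive: {name_a} vs {name_b} ({distance} bits apart)")
```

Lowering the threshold reduces false positives at the cost of more false negatives, which is the trade-off described in the list of conditions above.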

But can you please try algorithm Y?

If you've looked at our considerations and conditions and think that you have an algorithm that's worth trying, we would still love to test it. The easiest way is to fork Blockhash on GitHub and create an implementation of your algorithm. You can then submit a pull request and/or create an issue with a reference to your fork, so that we can pull your code and test it.

Database information

How many photographs are in your database?

As of the 18th of November 2014, we have information about 22,412,293 photographs in the catalog. These photographs are exclusively from Wikimedia Commons and as such represent a very wide variety of photographs, most of which are already used elsewhere in Wikipedia.

What information do you store?

We store the author, title and license information for each photograph. In addition to this, we record the URLs of the photograph as well as at least one thumbnail or smaller-resolution version. This is the information that we get from Wikimedia Commons.

We supplement the information from Wikimedia Commons with our own calculation of a Blockhash value for each photograph.

Contact information

Who runs this?

The initial work on Elog.io was provided by Commons Machinery, funded by the Shuttleworth Foundation. The project is run as a free and open source project by Jonas Öberg. Hosting is currently provided by Commons Machinery.

Can I email you?

We'd love to hear from you! You can email us at

What other ways can I get in touch?

Email is really the key to our hearts, but you can also reach out to us on Twitter (@elog_io) or by sending post to us at:
Åsgatan 5
646 32 Gnesta