Create your own word cloud from any text to visualize word frequency.

Frequently Asked Questions

  1. What is TagCrowd?
  2. How do I make a word cloud?
  3. Can I use TagCrowd for commercial purposes?
  4. Is there a TagCrowd API? Can I use a script?
  5. Is the data I input confidential?
  6. How do I create an image or PDF of my cloud?
  7. How do I customize the look of my word cloud?
  8. How do I keep multiple words together in the cloud (e.g. 'New York')?
  9. Why is a certain word not showing up in my cloud? How do I prevent that?
  10. Does TagCrowd work for non-English languages?
  11. How does the URL text source work?
  12. How can I make a cloud of a very long text file?
  13. How do I embed my word cloud in a web page?
  14. Can I install TagCrowd on my own server or computer?

What is TagCrowd?

TagCrowd is a web application for visualizing word frequencies in any text by creating what is popularly known as a word cloud, text cloud or tag cloud.

It was created by Daniel Steinbock while a PhD student at Stanford University.

A word cloud is a beautiful, informative image that communicates much in a single glance. TagCrowd specializes in making word clouds easy to read, analyze and compare, for a variety of useful purposes:

The list goes on and continues to grow.

How do I make a word cloud?

There are three ways of entering text into TagCrowd to generate a word cloud:

  1. Enter the URL for a web page you wish to visualize,
  2. Paste (or type) the text you wish to visualize into the text box,
  3. Upload a plain text file to visualize.

Uploaded and URL-linked files can be at most 5 megabytes, and pasted text at most 500 kilobytes. Only plain text files are accepted (or HTML files if providing a web page URL).

After providing your text source, hit the Visualize! button to see the result with the default options. You now have several options available to tweak your cloud into a form you're happy with.

Language of text:

Choose the written language of the text you are visualizing. TagCrowd maintains a list of common words (called a 'stop list') for each supported language so these words won't show up in your word cloud. If you wish to turn this function off, select 'none' for the language. If there are additional words you want to remove from your word cloud, see "Don't show these words" below.
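
To illustrate what a stop list does, here is a minimal Python sketch that you could run on your own computer. It is not TagCrowd's actual code: the `STOP_WORDS` set is a tiny hypothetical sample, and the function name `remove_stop_words` is made up for this example.

```python
import re

# Tiny hypothetical stop list; a real one contains many more words.
STOP_WORDS = {"the", "a", "of", "and", "is", "it", "in"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the lowercase words of text that are not in the stop list."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in stop_words]

print(remove_stop_words("The cloud is a picture of the text"))
# ['cloud', 'picture', 'text']
```

Passing an empty set as `stop_words` keeps every word, which corresponds to selecting 'none' for the language.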

Maximum number of words to show:

TagCrowd works by counting the frequency of every word in your source text and visualizing the top N of these as a word cloud. You set the value of N with this field. The appropriate value will depend on your application and the size of your source text. In general, it's better to use smaller clouds for shorter source texts and larger clouds for longer source texts.
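
The "top N" idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions (words are runs of letters and apostrophes, compared in lowercase), and `top_words` is a hypothetical name, not part of TagCrowd.

```python
import re
from collections import Counter

def top_words(text, n):
    """Count every word in text and return the n most frequent as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

print(top_words("red blue red green red blue", 2))
# [('red', 3), ('blue', 2)]
```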

Minimum frequency:

Words must appear at least this number of times in order to show up in the word cloud. For example, if you enter '2' for the minimum frequency, only words that appear at least twice in your text source will be included in your word cloud.
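
As a rough sketch, a minimum-frequency filter is a single pass over the word counts. The function name `apply_min_frequency` is hypothetical and the example only mirrors the behavior described above.

```python
from collections import Counter

def apply_min_frequency(counts, minimum):
    """Keep only the words that appear at least `minimum` times."""
    return {word: n for word, n in counts.items() if n >= minimum}

counts = Counter("tea tea coffee milk milk milk".split())
print(apply_min_frequency(counts, 2))
# {'tea': 2, 'milk': 3}
```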

Show frequencies:

Marking 'yes' here will display the actual number of times each word appears in your text source.

Group similar words (English only):

TagCrowd uses the standard Porter Stemming Algorithm to detect and combine similar words. For example, the words 'teachers', 'teaching' and 'teach' will all be combined so your word cloud is less redundant. The most frequent of the variants is chosen to represent them all. In the case of a tie, the shorter variant is used.
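
The rule for choosing a representative (most frequent variant, shorter one on a tie) can be sketched as follows. Note the assumptions: `crude_stem` is a deliberately simple stand-in that strips a few suffixes and is not the Porter Stemming Algorithm, and both function names are hypothetical.

```python
from collections import Counter, defaultdict

def crude_stem(word):
    """Rough stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ers", "ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def group_variants(counts, stem=crude_stem):
    """Merge words that share a stem and label each group with one variant."""
    groups = defaultdict(list)
    for word, n in counts.items():
        groups[stem(word)].append((word, n))
    merged = {}
    for variants in groups.values():
        # Most frequent variant wins; on a tie, the shorter one is used.
        label = min(variants, key=lambda wn: (-wn[1], len(wn[0])))[0]
        merged[label] = sum(n for _, n in variants)
    return merged

counts = Counter({"teachers": 2, "teaching": 3, "teach": 3})
print(group_variants(counts))
# {'teach': 8}
```

A real stemmer can be swapped in through the `stem` parameter without changing the grouping logic.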

Don't show these words:

You may see words in your cloud that are irrelevant. Type those words here to remove them from the cloud.

Can I use TagCrowd for commercial purposes?

TagCrowd word clouds are free to use under a Creative Commons Attribution License. That means you can use them for commercial and non-commercial purposes as long as you attribute TagCrowd.com with a name and a link.

If you find TagCrowd valuable in your business and would like to ensure its continued existence, you can buy the developer a cup of coffee.

Is there a TagCrowd API? Can I use a script?

TagCrowd is a non-commercial project running on a personal server. I cannot afford the server cost of supporting automated scripts or serving word clouds through an external API. Scripts are prohibited by the Terms of Use and will probably crash the server.

Is the data I input confidential?

The text you enter into TagCrowd is transmitted over an encrypted SSL connection, exists on our server for the few milliseconds it takes to generate the word cloud, and is then immediately deleted. Your data is never shared with anyone. If you generate a PDF, the PDF of your word cloud is deleted after 48 hours; however, the source text is never stored. For further details, see our Privacy Policy.

How do I create an image or PDF of my cloud?

Save your word cloud as PDF by clicking on the 'Save as...' button under the cloud, then choosing PDF. You'll get a download link.

To make an image of the word cloud, take a screenshot. On Windows, use the system's built-in screenshot tool. On a Mac, press Command-Shift-4 (labeled Apple-Shift-4 on older keyboards) and drag a box around the cloud you want to save; the screenshot image is saved to your Desktop. If you use Linux, you probably already know how to do this.

How do I customize the look of my word cloud?

You’ll find a “CUSTOMIZE” section near the top of the HTML Embed code where you can customize some of the CSS styles to suit the style of your webpage. Custom styles include font and font size, overall cloud size, margins, padding, borders and background color.

In the future we’ll introduce controls for changing the color and fonts without having to edit CSS.

You can of course edit the CSS and HTML that lie outside the customize section, but that is advanced work and we can’t provide support for it.

How do I keep multiple words together in the cloud (e.g. 'New York')?

Use a tilde character (~) between words you want to keep together. To do this, run a find & replace on the original text and insert a tilde between the words you want to group. For example: replace 'New York' with 'New~York', 'word cloud' with 'word~cloud', etc. In the resulting cloud, each tilde is displayed as a non-breaking space.
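
If your text editor has no find & replace, the same substitution can be done on your own computer with a short Python script before you paste or upload the text. This is a minimal sketch; `join_phrases` is a hypothetical helper name, and matching is exact and case-sensitive.

```python
def join_phrases(text, phrases):
    """Replace the spaces inside each listed phrase with tildes."""
    for phrase in phrases:
        text = text.replace(phrase, phrase.replace(" ", "~"))
    return text

sample = "I love New York and every word cloud."
print(join_phrases(sample, ["New York", "word cloud"]))
# I love New~York and every word~cloud.
```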

Why is a certain word not showing up in my cloud? How do I prevent that?

TagCrowd uses language-specific lists of common words to keep word clouds relevant. You can always disable this feature by setting the Language to 'none'. To prevent particular words from being removed, add a ~ (tilde character) to the end of any word you want to preserve.

For example, 'IT' is an acronym for 'information technology', but it's also the common English word 'it'. Replace all occurrences of 'IT' with 'IT~' to keep it in the cloud. Just be careful you're only marking the words you actually want to keep. In this example, don't mark the common word 'it' as well in your text.
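
A case-sensitive, whole-word replacement does this safely. The following Python sketch is meant to be run locally on your source text; `protect_word` is a hypothetical name chosen for the example.

```python
import re

def protect_word(text, word):
    """Append a tilde to exact, case-sensitive, whole-word matches of `word`."""
    # (?!~) skips occurrences that are already marked.
    pattern = rf"\b{re.escape(word)}\b(?!~)"
    return re.sub(pattern, lambda m: m.group(0) + "~", text)

print(protect_word("IT staff said it works; IT agreed.", "IT"))
# IT~ staff said it works; IT~ agreed.
```

Because the match is case-sensitive, the common word 'it' is left unmarked, and running the function twice does not add a second tilde.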

Does TagCrowd work for non-English languages?

TagCrowd is Unicode compliant and offers basic support for many languages. Choose the language of your source text in the Options section of the TagCrowd application. "Basic support" currently covers languages based on the Latin alphabet (i.e. most of Europe), with all accented characters converted to plain characters. For example, the characters é, ä, ç become e, a, c. Since this is the first international version of TagCrowd, there will certainly be some bugs. Please let us know if you find any.
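
To preview what accent conversion does to your text, here is one common way to do it in Python. This is an assumption about the general technique (Unicode decomposition followed by dropping combining marks), not a description of TagCrowd's own implementation, and `fold_accents` is a hypothetical name.

```python
import unicodedata

def fold_accents(text):
    """Convert accented Latin characters to their plain base characters."""
    decomposed = unicodedata.normalize("NFD", text)
    # After decomposition, accents are separate combining marks that can be dropped.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold_accents("é ä ç"))
# e a c
```

Letters that have no decomposition, such as ø, pass through unchanged in this sketch.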

TagCrowd can only support languages for which we have a list of common words, known as a 'stop list' or 'stop words'. Currently supported languages include Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Romanian, Spanish and Swedish. (Click on your language to see the stop words we are using -- be sure to view them with Unicode UTF8 text encoding.)

Let us know if you want to suggest additional stop words for these languages. And if TagCrowd doesn't currently work in your language, please send us a list of common words in your language.

How does the URL text source work?

TagCrowd reads the web document at the address you provide and extracts the text, ignoring the HTML code that surrounds it. Because of the wide variation in web standards compliance, this process isn't 100% reliable. Note: TagCrowd only crawls a single web document, not the whole domain. Also, TagCrowd can't discriminate between content text and extraneous text like navigation menus, so if you want a more focused cloud, try copy/pasting the exact text you want to visualize.
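
The following Python sketch shows the general idea of pulling text out of HTML, and why menu text ends up in the result. It uses the standard library's `html.parser` and is only an illustration; the class and function names are hypothetical, and TagCrowd's own extraction may work differently.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

page = ("<html><body><nav>Home About</nav>"
        "<p>Word clouds are <b>fun</b>.</p>"
        "<script>var x = 1;</script></body></html>")
print(extract_text(page))
```

The script contents are dropped, but the navigation text 'Home About' is kept alongside the paragraph, because the markup alone does not say which text is the main content.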

How can I make a cloud of a very long text file?

Upload the file to TagCrowd or link to it wherever it's posted on the web. Only plain text files are accepted (or HTML files if providing a web page URL). The maximum file size is 5 megabytes. The limit for pasted text is 500 kilobytes, although in practice it depends more on your browser and computer than on TagCrowd's servers.
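
If you are unsure which input method a text qualifies for, you can check its size locally first. This sketch assumes 1 megabyte means 1,000,000 bytes and 1 kilobyte means 1,000 bytes, which the limits above do not specify; the constant and function names are hypothetical.

```python
# Assumption: decimal units (1 MB = 1,000,000 bytes, 1 kB = 1,000 bytes).
MAX_FILE_BYTES = 5 * 1000 * 1000
MAX_PASTE_BYTES = 500 * 1000

def allowed_inputs(size_in_bytes):
    """List the input methods a text of this size could use under the stated limits."""
    methods = []
    if size_in_bytes <= MAX_PASTE_BYTES:
        methods.append("paste")
    if size_in_bytes <= MAX_FILE_BYTES:
        methods.extend(["upload", "url"])
    return methods

print(allowed_inputs(2_000_000))
# ['upload', 'url']
```

For a file on disk, `os.path.getsize(path)` gives the size in bytes; for text held in a string, `len(text.encode("utf-8"))` gives its size when encoded as UTF-8.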

How do I embed my word cloud in a web page?

After you generate your word cloud, click the 'Save as...' button underneath. Click the option for 'HTML embed' and a box of HTML code will appear. Copy and paste the code into any web page that allows in-line stylesheets. Feel free to modify the colors and font sizes in the stylesheet to customize your cloud, as long as you keep the reference to TagCrowd. You can also add URLs into the links so the words in the cloud link somewhere. This code should work with most blog software -- but not all. If it doesn't, try moving the style information from the code into your blog's external stylesheet. We're working on a way to improve compatibility.

Can I install TagCrowd on my own server or computer?

TagCrowd software only runs on TagCrowd.com. You are free to use our HTML and PDF embedding options, and save screenshots. The application itself is not currently for sale.