Martin Normark's blog

Posted by Martin Normark


*** This post is no longer up to date. Take a look at my new post, revisiting how to translate text in C# using Google Translate ***

Sometimes it would be great to translate text, e.g. from English to Danish, directly from C#. This could be useful when you want to translate a resource file into another language.

Google Translate is awesome. There’s also Windows Live Translator, but Microsoft is far behind Google in this game (too).

Code:

using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

namespace Utilities
{
  public static class Translator
  {
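    /// <summary>
    /// Translates the text using Google Translate, decoding the response as UTF-7.
    /// </summary>
    /// <param name="input">The string you want translated.</param>
    /// <param name="languagePair">2 letter language pair, delimited by "|", e.g. "en|da".</param>
    /// <returns>The translated string.</returns>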
    public static string TranslateText(string input, string languagePair)
    {
      return TranslateText(input, languagePair, System.Text.Encoding.UTF7);
    }

    /// <summary>
    /// Translates text using Google Translate.
    /// </summary>
    /// <param name="input">The string you want translated.</param>
    /// <param name="languagePair">2 letter language pair, delimited by "|".
    /// E.g. the "en|da" language pair means to translate from English to Danish.</param>
    /// <param name="encoding">The encoding used to decode the response.</param>
    /// <returns>The translated string.</returns>
    public static string TranslateText(string input, string languagePair, Encoding encoding)
    {
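      // URL-encode both values so spaces, "&", "|" and non-ASCII characters survive the query string.
      string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", Uri.EscapeDataString(input), Uri.EscapeDataString(languagePair));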

      string result = String.Empty;

      using (WebClient webClient = new WebClient())
      {
        webClient.Encoding = encoding;
        result = webClient.DownloadString(url);
      }

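      // Pull the translation out of the result page. The pattern is tied to Google's markup.
      Match m = Regex.Match(result, "(?<=<div id=result_box dir=\"ltr\">)(.*?)(?=</div>)");

      // If the pattern does not match, the raw HTML of the page is returned unchanged.
      if (m.Success)
        result = m.Value;

      return result;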
    }
  }
}

The translated string is extracted by the regular expression near the bottom of the method. The pattern depends on the markup of Google's result page, which could of course change, so you have to keep it up to date.
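To see what the pattern does without calling Google, here is a minimal sketch that runs it against a hard-coded HTML string. The class name ResultBoxExtractor and the sample markup are made up for illustration; only the pattern itself comes from the code above. The lookbehind anchors the match right after the opening result_box div, the lazy group takes as little text as possible, and the lookahead stops at the first closing div.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Translator class above.
public static class ResultBoxExtractor
{
    // Same pattern as in TranslateText.
    private const string Pattern = "(?<=<div id=result_box dir=\"ltr\">)(.*?)(?=</div>)";

    // Returns the text inside the result_box div, or null when the markup is not found.
    public static string Extract(string html)
    {
        Match m = Regex.Match(html, Pattern);
        return m.Success ? m.Value : null;
    }
}
```

Calling ResultBoxExtractor.Extract("<div id=result_box dir=\"ltr\">Hej verden</div>") returns "Hej verden". Returning null on a miss, instead of the whole page as TranslateText does, makes a markup change easier to spot.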

About the author

Martin Normark works as a freelance web developer (consultant). He blogs about web, software and programming experiments, daily code battles, specific how-to posts and whatever else comes to mind.

Posted in C#

  • http://martinnormark.com Martin H. Normark

    Have you tried to pass in "en|ar" as the languagePair?

  • Daxii

    How can I translate formatted HTML text, the way Google Translate does when you type a full URL?

  • Lennart Øster

    It seems that Google has changed the layout of its result page. I chose to use the HtmlAgilityPack, which makes it much easier to handle those changes.

    public static string Translate(string input, string languagePair, Encoding encoding)
    {
      string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", input, languagePair);

      string result = String.Empty;

      using (WebClient webClient = new WebClient())
      {
        webClient.Encoding = encoding;
        result = webClient.DownloadString(url);
      }

      HtmlDocument doc = new HtmlDocument();
      doc.LoadHtml(result);
      return doc.DocumentNode.SelectSingleNode("//textarea[@name='utrans']").InnerText;
    }

    Get the HtmlAgilityPack here: http://www.codeplex.com/htmlagilitypack

    I hope this will help keep the translation working for everyone.

  • bob

    I am trying to use this code in C# 2008 Express on Windows XP.
    I have modified the line below
    //Match m = Regex.Match(result, "(?<=<div id=result_box dir=\"ltr\">)(.*?)(?=</>)");
    to
    Match m = Regex.Match(result, "(?<=overflow:auto\">)(.*?)(?<=</textarea>)");
    This returns the text, but the translation pair is not working as I expected. If I pass in "HELLO MY FRIEND" "en|sp" the result is "HELLO MY FRIEND</textarea>". If I pass in "HOLA MI AMIGO" "sp|en" I get "HELLO MY FRIEND</textarea>".
    If I pass in "HELLO MY FRIEND" "en|da" the result is "Hello my friend</textarea>". The text comes back in proper case, but with no translation. Can you throw any light on this problem?
    Thanks in advance
    Bob pointon

  • http://martinnormark.com Martin H. Normark

    Hi Bob

    Take a look at the comment above yours. Lennart points out that Google may have changed their HTML markup lately, which has broken my code.

    Hope you’ll get it working.
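
    On the stray "</textarea>" in Bob's results: his modified pattern ends with a lookbehind, (?<=</textarea>), so the lazy group has to run past the closing tag before the match can end. A lookahead, (?=</textarea>), stops in front of the tag instead. The sketch below is a guess at a fix, checked only against a made-up HTML fragment rather than Google's real page, and the class and method names are hypothetical.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // Hypothetical demo class comparing the two pattern endings.
    public static class LookaroundDemo
    {
        // Made-up fragment standing in for the result markup.
        public const string Sample = "<textarea style=\"overflow:auto\">Hello my friend</textarea>";

        // Bob's pattern: the trailing lookbehind pulls "</textarea>" into the match.
        public static string WithTrailingLookbehind(string html)
        {
            return Regex.Match(html, "(?<=overflow:auto\">)(.*?)(?<=</textarea>)").Value;
        }

        // Same pattern with a trailing lookahead: the match ends before "</textarea>".
        public static string WithTrailingLookahead(string html)
        {
            return Regex.Match(html, "(?<=overflow:auto\">)(.*?)(?=</textarea>)").Value;
        }
    }
    ```

    For the sample fragment, the first method returns "Hello my friend</textarea>" and the second returns "Hello my friend".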

  • Sunny

    hi martin,

    webClient.Encoding and webClient.DownloadString are not available in .NET 2003.
    Is there any workaround to make it work in .NET 2003? It would be of great help.

    thanks,
    sunny

  • http://martinnormark.com Martin H. Normark

    Hi Sunny

    You should be able to use the DownloadData method. Take a look at the MSDN documentation here: http://msdn.microsoft.com/en-us/library/system.net.webclient.downloaddata(v=VS.71).aspx

    There’s a code example down the page, showing you how to transform the byte array you get back into a string.
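
    A minimal sketch of that workaround, assuming you only need a replacement for the Encoding/DownloadString pair: download the raw bytes with DownloadData, then decode them with Encoding.GetString. The class name WebClientCompat and its method names are invented for this example, and it has not been tried on the 2003 toolchain.

    ```csharp
    using System;
    using System.Net;
    using System.Text;

    // Hypothetical helper; the class and method names are illustrative only.
    public sealed class WebClientCompat
    {
        private WebClientCompat() { }

        // Turns the raw response bytes into a string using the given encoding.
        public static string Decode(byte[] data, Encoding encoding)
        {
            return encoding.GetString(data);
        }

        // Stand-in for setting webClient.Encoding and calling webClient.DownloadString(url).
        public static string DownloadAsString(string url, Encoding encoding)
        {
            using (WebClient webClient = new WebClient())
            {
                byte[] data = webClient.DownloadData(url);
                return Decode(data, encoding);
            }
        }
    }
    ```

    In TranslateText, the using block would then shrink to result = WebClientCompat.DownloadAsString(url, encoding);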

  • Noobguest

    The following regex:

    Match m = Regex.Match(result, "(?<=)(.*?)(?=)")

    won’t work anymore… Since I don’t know how to modify it, I’ve come up with a “dirty” solution that seems to work.. for now..

    string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", input, languagePair);
    string result = String.Empty;
    using (WebClient webClient = new WebClient())
    {
      webClient.Encoding = encoding;
      result = webClient.DownloadString(url);
    }
    return result.Substring(result.IndexOf("onmouseover=\"this.style.backgroundColor='#ebeff9'\" onmouseout=\"this.style.backgroundColor='#fff'\">") + 98, result.IndexOf("") + 98));

  • http://www.milkshakecommerce.com/ecommerce-blog Martin H. Normark

    Hi

    Please take a look at my updated blog post on Google Translate: http://martinnormark.com/translate-text-in-c-using-google-translate-revisited

    It uses the official API instead of scraping the HTML.