Wednesday, December 12, 2018

Source Code Annotation – What is it and Why You Need It


Text annotation is no longer just about labeling words and finding them later! The scope of annotation tools has expanded considerably in the last 5 or 6 years. Some of this popularity comes from advances in digital technologies such as machine learning, which is being integrated with text annotation to make the entire process faster and easier. The digitization of organizational operations has also played a part by opening up more places where text annotation can be put to use.

One such field is cyber security!

Text annotation tools have found a niche in the security arsenal of organizations tightening their safety standards. Annotation tools are being used to check and safeguard the source code of the IT infrastructure!

What is Source Code Annotation?

Developers and coders tend to look to their peers for answers when they get stuck on a problem, and online forums are a great place to find help. However, a malicious line or snippet of code is sometimes inadvertently copied from these forums, exposing the entire organization (and its customers) to a possible cyber attack.

With a text annotation tool, developers and organizations can search for bad code or snippets in large amounts of source code – something that is almost impossible to do manually!

How Does It Work?

Users can upload a dictionary of pre-determined lines or snippets of code to the interface, and as they upload new batches of source code, the tool automatically finds and highlights any bad code it contains. Over time, the dictionary can be edited and expanded to include more code samples, so the system catches more.
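The dictionary lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular annotation tool works: the names `Finding`, `normalize`, and `scan_source` are hypothetical, the sample snippets are made up, and matching is a plain substring check after stripping whitespace.

```python
from typing import Iterable, List, NamedTuple


class Finding(NamedTuple):
    """One highlighted hit: where it is and which dictionary entry matched."""
    line_number: int
    snippet: str
    line: str


def normalize(text: str) -> str:
    # Assumption: remove all whitespace so spacing differences do not hide a match.
    # This is deliberately crude and can over-match in real code.
    return "".join(text.split())


def scan_source(source: str, dictionary: Iterable[str]) -> List[Finding]:
    """Return every line of `source` that contains a known bad snippet."""
    patterns = [(snippet, normalize(snippet)) for snippet in sorted(dictionary)]
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        cleaned = normalize(line)
        for snippet, pattern in patterns:
            if pattern and pattern in cleaned:
                findings.append(Finding(number, snippet, line.strip()))
    return findings


# Hypothetical dictionary of bad snippets and one uploaded batch of source code.
bad_snippets = {"os.system(user_input)", "eval(request.args['cmd'])"}
batch = """import os
def run(user_input):
    os.system( user_input )
    return 0
"""

findings = scan_source(batch, bad_snippets)
for hit in findings:
    print(f"line {hit.line_number}: {hit.line}  (matched: {hit.snippet})")
```

Expanding the dictionary is just a matter of adding entries to `bad_snippets` and scanning again. A real tool would likely need smarter matching (multi-line snippets, renamed variables), which this sketch does not attempt.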

With an annotation tool, bad/malicious code can be found and dealt with quickly, and this is exactly what users want - to catch problems before they blow up! The simplicity and speed of text annotation tools have ensured that source code annotation will soon be adopted as a standard part of cyber security measures.

Sunday, December 2, 2018

Artificial Intelligence in Text Annotation? How Does That Work?


As you are reading this, we would assume you have already used a text annotation tool at some point! Without a doubt, text annotation solutions have become the backbone of a variety of research and data analysis work.

It would not be an exaggeration to say that it is impossible to conceive of any other way in which people could find and analyze information, especially in this digital age.

Text annotation tools started as simple, open-source lines of code that made it possible for researchers to label and cross-reference words and sentences so that they could find this information quickly later. Of course, all this was done manually and was pretty time-consuming.

Enter Artificial Intelligence!

When machine learning started gaining momentum a few years ago, some developers integrated the two concepts. Spurred by the requirements of the B2B market, small start-ups invested time and money to create user-friendly text annotation tools that addressed specific issues businesses face every day.

How Does Built-In ML Work in a Text Annotation Tool?

Manual annotation requires a person to read through the entire text and tag the words individually. This is a good workflow if you want to access the same document later and quickly find specific words or relations, but it is tedious, and the quality of the work depends entirely on the person annotating the text.

With integrated machine learning, this work is automated! As manual annotation takes place, an AI model is trained alongside it. For example, if you are looking for the word “Beijing” in a document, you start annotating manually, and after the first few occurrences are annotated, the ML takes over and highlights all the other occurrences of “Beijing” in the text. You can also train the model to annotate variations such as different names of the city like “Peking”, “Beiping”, or “北京”, or even places within the city boundaries such as “Tian'anmen”, “Beijing CBD”, or “Chaoyang”. As you check the document and make corrections, the AI also becomes smarter and more accurate. Simple, easy, and fast!
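The annotate, suggest, and correct loop can be illustrated with a deliberately simple stand-in. The sketch below is not machine learning: it only remembers the surface forms a user has tagged and proposes every other occurrence, whereas a real tool would train a statistical model. The class name `SuggestionModel`, the `CITY` label, and the sample text are all assumptions made for illustration.

```python
import re
from collections import defaultdict


class SuggestionModel:
    """Lookup-based stand-in for a model trained alongside manual annotation."""

    def __init__(self):
        # label -> set of surface forms the user has confirmed so far
        self.forms = defaultdict(set)

    def learn(self, label, surface_form):
        """Record a manual annotation (or an accepted suggestion)."""
        self.forms[label].add(surface_form)

    def forget(self, label, surface_form):
        """Record a correction: the user rejected this form for this label."""
        self.forms[label].discard(surface_form)

    def suggest(self, text):
        """Return (start, end, label) for every occurrence of a known form."""
        suggestions = []
        for label, forms in self.forms.items():
            for form in forms:
                for match in re.finditer(re.escape(form), text):
                    suggestions.append((match.start(), match.end(), label))
        return sorted(suggestions)


text = "Beijing hosted the talks. Peking duck is famous; 北京 is the capital, and Beijing is large."

model = SuggestionModel()
model.learn("CITY", "Beijing")          # first manual annotation
first_pass = model.suggest(text)        # both "Beijing" mentions are proposed

model.learn("CITY", "Peking")           # the user teaches the variants
model.learn("CITY", "北京")
second_pass = model.suggest(text)

print([text[start:end] for start, end, _ in second_pass])
```

Each call to `learn` or `forget` plays the role of the user's manual tags and corrections. In an ML-backed tool, the same feedback would become training examples, so the model could also generalize to forms it has never been shown.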

Automatic text annotation has not only taken the pain out of annotation but also opened many new areas of usage. What was earlier seen as a tool just for researchers is now being integrated heavily into business operations – helping management understand and use the data they possess!