Study on Tokenization and Tool Fabrication

EasyChair Preprint 10366

6 pages • Date: June 9, 2023

Singh Shubham Rajan, Tushar Lahaik, Prince Verma, Rishab Kumar, Akash Deep and Battula Naga Prasanth Reddy

Abstract

Tokenization is an important step in compilation, as it enables machines to understand human language more effectively. During tokenization, a lexical analyzer reads the input character by character and groups the characters into meaningful tokens based on predefined rules. These tokens may include keywords, strings, variables, operators, constants, and special symbols. The tokens are then passed on to the next stage of the compilation process. The Tokens Identifier Model is a web-based platform that assists learners in classifying and identifying the different tokens in the C and C++ programming languages. The tool seeks to make learning easier for newcomers and for college students with experience in computer applications. By accepting input in the form of .cpp and .c files and outputting the categorized tokens as tables, the platform helps learners better understand the structure and syntax of these languages.
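The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' Tokens Identifier Model, whose implementation is not described on this page; it is a small regular-expression tokenizer for a C-like subset, written in Python for brevity. The names `KEYWORDS`, `TOKEN_SPEC`, `tokenize`, and `categorize` are hypothetical, the keyword set is deliberately incomplete, and comments and preprocessor directives are not handled.

```python
import re

# Illustrative subset of C/C++ keywords (assumption: not the full list).
KEYWORDS = {"int", "float", "char", "if", "else", "for", "while", "return", "void"}

# Order matters: earlier alternatives win, so string literals and numeric
# constants are tried before identifiers, operators, and special symbols.
TOKEN_SPEC = [
    ("STRING",     r'"(?:\\.|[^"\\])*"'),
    ("CONSTANT",   r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"==|!=|<=|>=|&&|\|\||\+\+|--|[+\-*/%=<>!]"),
    ("SPECIAL",    r"[()\[\]{};,]"),
    ("SKIP",       r"\s+"),
    ("UNKNOWN",    r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (category, lexeme) pairs for a C-like source string."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup  # name of the alternative that matched
        lexeme = match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not itself a token
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"  # reserved words are matched as identifiers first
        tokens.append((kind, lexeme))
    return tokens


def categorize(tokens):
    """Group lexemes by category, roughly the shape of a per-category table."""
    table = {}
    for kind, lexeme in tokens:
        table.setdefault(kind, []).append(lexeme)
    return table


if __name__ == "__main__":
    for kind, lexemes in categorize(tokenize('int x = 42;')).items():
        print(f"{kind:<12} {' '.join(lexemes)}")
```

Matching keywords as identifiers and reclassifying them afterwards keeps the pattern list short; a production lexer would also need to handle comments, character literals, and the full operator set.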

Keyphrases: LexicalAnalysis, Tokens, compiler

Links:

https://easychair.org/publications/preprint/R7B3

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:10366,
  author    = {Singh Shubham Rajan and Tushar Lahaik and Prince Verma and Rishab Kumar and Akash Deep and Battula Naga Prasanth Reddy},
  title     = {Study on Tokenization and Tool Fabrication},
  howpublished = {EasyChair Preprint 10366},
  year      = {EasyChair, 2023}}
