Parsing SQL code in C# [closed]

Posted by: admin November 29, 2017

Questions:

I want to parse SQL code using C#.

Specifically, is there any freely available parser which can parse SQL code and generate a tree or any other structure out of it? It should also generate the proper tree for nested structures.

It should also return which kind of statement the node of this tree represents.

For example, if the node contains a loop condition, then it should report that this is a “loop” type of node.

Or is there any way by which I can parse the code in C# and generate a tree of the type I want?

Answers:

Use Microsoft Entity Framework (EF).

It has an “Entity SQL” parser which builds an expression tree:

using System.Data.EntityClient;
...
EntityConnection conn = new EntityConnection(myContext.Connection.ConnectionString);
conn.Open();
EntityCommand cmd = conn.CreateCommand();
cmd.CommandText = @"Select t.MyValue From MyEntities.MyTable As t";
var queryExpression = cmd.Expression;
....
conn.Close();

Or something like that; check the details on MSDN.

And it’s all on Ballmer’s tick 🙂

There is also one on The Code Project, SQL Parser.

Good luck.

Answers:

Scott Hanselman recently featured the Irony project which includes a sample SQL parser.

Answers:

Specifically for Transact-SQL (Microsoft SQL Server), you can use the Microsoft.SqlServer.Management.SqlParser.Parser namespace in Microsoft.SqlServer.Management.SqlParser.dll, an assembly that is included with SQL Server and can be freely distributed.

Here’s an example method for parsing T-SQL as a string into a sequence of tokens:

using System.Collections.Generic;
using Microsoft.SqlServer.Management.SqlParser.Parser;
...
IEnumerable<TokenInfo> ParseSql(string sql)
{
    ParseOptions parseOptions = new ParseOptions();
    Scanner scanner = new Scanner(parseOptions);

    // Scanner state and the start/end offsets of the current token.
    int state = 0,
        start,
        end,
        token;

    bool isPairMatch, isExecAutoParamHelp;

    List<TokenInfo> tokens = new List<TokenInfo>();

    // Start scanning at offset 0 of the input string.
    scanner.SetSource(sql, 0);

    // Pull tokens one at a time until the scanner reports end of input.
    while ((token = scanner.GetNext(ref state, out start, out end, out isPairMatch, out isExecAutoParamHelp)) != (int)Tokens.EOF)
    {
        TokenInfo tokenInfo =
            new TokenInfo()
            {
                Start = start,
                End = end,
                IsPairMatch = isPairMatch,
                IsExecAutoParamHelp = isExecAutoParamHelp,
                // The length is end - start + 1, i.e. end is treated as inclusive.
                Sql = sql.Substring(start, end - start + 1),
                Token = (Tokens)token,
            };

        tokens.Add(tokenInfo);
    }

    return tokens;
}

Note that the TokenInfo class is just a simple class with the above-referenced properties.
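
Since TokenInfo is not part of the SqlParser assembly, here is a minimal sketch of what it could look like. The property names and types are inferred from the object initializer in ParseSql above; PrintTokens is a hypothetical helper added only to show how the result might be consumed. It assumes your project references Microsoft.SqlServer.Management.SqlParser.dll.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.SqlServer.Management.SqlParser.Parser;

// Plain data holder for one scanned token.
// Property names mirror the initializer used in ParseSql.
public class TokenInfo
{
    public int Start { get; set; }                // offset of the first character
    public int End { get; set; }                  // offset of the last character (inclusive)
    public bool IsPairMatch { get; set; }
    public bool IsExecAutoParamHelp { get; set; }
    public string Sql { get; set; }               // the token's text
    public Tokens Token { get; set; }             // the token's kind
}

public static class TokenPrinter
{
    // Hypothetical helper: writes one line per token, e.g. for debugging.
    public static void PrintTokens(IEnumerable<TokenInfo> tokens)
    {
        foreach (TokenInfo t in tokens)
        {
            Console.WriteLine("{0,-30} [{1}..{2}] {3}", t.Token, t.Start, t.End, t.Sql);
        }
    }
}
```

You would call it as TokenPrinter.PrintTokens(ParseSql(someSql)). Keep in mind this only gives you a flat token list, not a tree.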

Tokens is an enumeration that includes constants like TOKEN_BEGIN, TOKEN_COMMIT, TOKEN_EXISTS, etc.

Answers:

Try ANTLR; there are a number of SQL grammars available for it.

Answers:

You may take a look at a commercial component: General SQL Parser at http://www.sqlparser.com
It supports the SQL syntax of Oracle, T-SQL, DB2 and MySQL.

Answers:

VSTS 2008 Database Edition GDR includes assemblies that handle SQL parsing and script generation, and you can reference them from your project. Database Edition uses the parser to turn the script files into an in-memory model of your database, and then uses the script generator to generate SQL scripts from that model. I think there are just two assemblies you need to reference in your project. If you don’t have Database Edition, you can install the trial version to get the assemblies, or there might be another way to get them without installing it. Check out the following link:
Data Dude: Getting to the Crown Jewels.

Answers:

Try GOLD Parser; it’s a powerful and easy-to-learn BNF engine. You can search the ready-made grammars for what you want (e.g. the SQL ANSI 89 grammar).

I started using this for HQL parsing (the NHibernate query language, very similar to SQL), and it’s awesome.

UPDATE: Now the NH dev team has done the HQL parsing using ANTLR (which is harder to use, but more powerful AFAIK).

Answers:

As Diego suggested, grammars are the way to go IMHO. I’ve tried Coco/R before, but it is too simple for complex SQL. There’s ANTLR, with a number of grammars ready to use.

Someone even built a SQL engine in C#: SharpHSQL (“An SQL engine written in C#”). Check its code to see if there’s something in it for you.
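
To make the grammar-based approach from these answers a bit more concrete, here is a minimal hand-written sketch of what such a parser produces: a tree whose nodes each report their kind. It is not ANTLR, GOLD or Coco/R output, just a tiny recursive-descent parser for the toy grammar `SELECT column, ... FROM table`. All names (NodeKind, Node, MiniSqlParser) are made up for this illustration, and a real SQL grammar needs far more node kinds and proper error handling.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical node kinds for this sketch only.
enum NodeKind { Select, ColumnList, Column, From, Table }

class Node
{
    public NodeKind Kind;
    public string Text;
    public List<Node> Children = new List<Node>();

    public Node(NodeKind kind, string text = null) { Kind = kind; Text = text; }

    // Prints the tree, indenting each level by two spaces.
    public void Dump(int depth = 0)
    {
        Console.WriteLine(new string(' ', depth * 2) + Kind + (Text == null ? "" : " " + Text));
        foreach (Node child in Children) child.Dump(depth + 1);
    }
}

class MiniSqlParser
{
    private readonly List<string> tokens = new List<string>();
    private int pos;

    public MiniSqlParser(string sql)
    {
        // Very naive tokenizer: keeps identifiers (dots allowed) and commas,
        // and silently skips everything else.
        foreach (Match m in Regex.Matches(sql, @"[A-Za-z_][A-Za-z0-9_.]*|,"))
            tokens.Add(m.Value);
    }

    private string Peek() { return pos < tokens.Count ? tokens[pos] : null; }

    private string Next()
    {
        if (pos >= tokens.Count) throw new FormatException("Unexpected end of input");
        return tokens[pos++];
    }

    private void Expect(string keyword)
    {
        string token = Next();
        if (!string.Equals(token, keyword, StringComparison.OrdinalIgnoreCase))
            throw new FormatException("Expected " + keyword + " but found " + token);
    }

    // select := SELECT columnList FROM table
    public Node ParseSelect()
    {
        Expect("SELECT");
        Node select = new Node(NodeKind.Select);
        select.Children.Add(ParseColumnList());
        Expect("FROM");
        Node from = new Node(NodeKind.From);
        from.Children.Add(new Node(NodeKind.Table, Next()));
        select.Children.Add(from);
        if (Peek() != null) throw new FormatException("Unexpected token " + Peek());
        return select;
    }

    // columnList := column ("," column)*
    private Node ParseColumnList()
    {
        Node list = new Node(NodeKind.ColumnList);
        list.Children.Add(new Node(NodeKind.Column, Next()));
        while (Peek() == ",")
        {
            Next(); // consume the comma
            list.Children.Add(new Node(NodeKind.Column, Next()));
        }
        return list;
    }
}

class Program
{
    static void Main()
    {
        Node tree = new MiniSqlParser("SELECT t.MyValue, t.Id FROM MyTable").ParseSelect();
        tree.Dump();
    }
}
```

Running it dumps a Select node with a ColumnList child (two Column nodes) and a From child (one Table node). Nested constructs such as subqueries or loops would be handled the same way, with one parse method and one node kind per grammar rule, which is exactly the work a parser generator does for you.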