Table of Contents

Namespace gudusoft.gsqlparser

Classes

EJoinHintHelper
TBaseType

Constants and functions used by parser internally.

TComponent

Placeholder

TCustomLexer

Base lexer of all databases.

TCustomParser

Base parser of all databases.

TCustomSqlStatement

TCustomSqlStatement is the root class for all SQL statements.

TGSqlParser

This is the class where you start to use this SQL parser library.

create a sql parser by specify a database vendor, then set sql script or a file including sql script need to be processed, then call one of the following methods to achieve what you need:

  • parse(), check syntax of input SQL, do a in-depth analysis of input SQL such as creating relationship between table and columns. SQL query parse tree nodes are available after calling parse method and return without any syntax errors.

    Syntax error in one SQL statement in SQL scripts separated by statement delimiter such as ; doesn't affects others, but failed to parse a SQL statement insides a stored procedure will leads all SQL statements inside this stored procedure un-parsed which means no parse tree nodes will be available for this stored procedure.

    If syntax error detected, please check ErrorCount and Errormessage. SyntaxErrors include a list of syntax error object.

  • checkSyntax(), same as parse()
  • tokenizeSqltext(), Turn input sql into a list of tokens which is the basic lexis element of SQL syntax.Token is categorized as keyword, identifier, number, operator, whitespace and other types. Fetch source tokens via sourcetokenlist
    Parse tree node is not available
  • getrawsqlstatements(), Separate SQL statement in a script file without doing syntax checking, a list of sql statements which is sub class of TCustomSqlStatement is avaiable after calling this method. Use statement type ESqlStatementType to check type of SQL statement, Use String to get text representation of each SQL statement, no further detailed information about SQL statement was available. You may fetch SQL statements via sqlstatements

Sample code used to format SQL script.

int ret = sqlparser.parse();
 if (ret == 0){
     GFmtOpt option = GFmtOptFactory.newInstance();
     String result = FormatterFactory.pp(sqlparser, option);
     System.out.println(result);
 }else{
     System.out.println(sqlparser.getErrormessage());
 }

Some use cases after paring SQL script: are:

  • Table/column impact analysis
  • SQL rewriting
  • SQL translate between different databases
  • SQL migration analysis
  • Help to anti SQL injection
  • More use cases

Typically, SQL parse tree nodes generated by this SQL Parser were closely related to SQL elements defined in database vendor's SQL reference book. here is a brief summary of some most used SQL elements and corresponding classes defined in this SQL parser.

Some major SQL statements:

Stored procedure

TLexerMssql
TSourceToken

Represents a source token which is the basic syntactical unit of SQL. All parse tree nodes including one or more tokens.
A list of source token will be available after parse or tokenize the input SQL.

  TGSqlParser sqlparser = new TGSqlParser(EDbVendor.dbvoracle);
  sqlparser.sqltext = "select col from t";
  int ret = sqlparser.parse();
  // this will geneate a list of tokens only, no parse tree node is available.
  // int ret = sqlparser.tokenizeSqltext();
  if (ret == 0){
   for(int i=0;i<sqlparser.sourcetokenlist.size();i++){
     TSourceToken st =  sqlparser.sourcetokenlist.get(i);
     System.out.println(st.tokentype.toString()+" "+st.toString());
   }
  }else{
    System.out.println(sqlparser.getErrormessage());
  }

}

Get a list of source tokens after call parse() or just call tokenizeSqltext() if you only need to access tokens of input SQL without generating full parse tree nodes.

tokencode is the unique id represents one kind of token. The most important type of whitespace, return, and tokentype

TSourceTokenList

List of source token.

TStatementList

List of SQL statements which is instance of class that descends from TCustomSqlStatement.

TSyntaxError

Detailed information about syntax error.

TVisitedAbs

Base class of all parse tree nodes that can be visited by visitors.

TVisitorAbs

Base class of all visitors that used to visit parse tree nodes.

Interfaces

IMetaDatabase

In order to link column to table correctly without connecting to database, we need to provide a class which implements IMetaDatabase to TGSqlParser. this class tells TGSqlParser the relation ship between column and table.

Take this SQL for example:

SELECT Quantity,b.Time,c.Description
FROM
(SELECT ID2,Time FROM bTab) b
INNER JOIN aTab a on a.ID=b.ID
INNER JOIN cTab c on a.ID=c.ID

General SQL Parser can build relationship between column: ID2 and table: bTable correctly without metadata information from database because there is only one table in from clause. But it can't judge column: Quantity belong to table: aTab or cTab, since no table alias was prefixed to column: Quantity. If no metadata provided, General SQL Parser will link column: Quantity to the first valid table (here it is aTab)

If we create a class metaDB implements IMetaDatabase,then TGSqlParser.setMetaDatabase(new metaDB()), General SQL Parser can take this advantage to create a correct relationship between column and tables. Here is a sample of metaDB, you should create your own metaDB class with meta information from database.

public class metaDB implements IMetaDatabase {

    String columns[][] = {
        {"dbo","aTab","Quantity1"},
        {"dbo","bTab","Quantity2"},
        {"dbo","cTab","Quantity"}
    };

    public boolean checkColumn(String schema, String table, String column){
       boolean bSchema,bTable,bColumn,bRet = false;
        for (int i=0; i<columns.length;i++){
            if (schema == null){
                bSchema = true;
            }else{
                bSchema = columns[i][0].equalsIgnoreCase(schema);
            }

            if (!bSchema) continue;

            bTable = columns[i][1].equalsIgnoreCase(table);
            if (!bTable) continue;

            bColumn = columns[i][2].equalsIgnoreCase(column);
            if (!bColumn) continue;

            bRet =true;
            break;

        }

        return bRet;
    }

}

After TGSqlParser.setMetaDatabase(new metaDB()), General SQL parser can generate this result for you:

atab.id
btab.id
btab.time
ctab.description
ctab.id
ctab.quantity
ITokenHandle

use this interface if you like to change token properties generated by lexer before send to parser.

package test.interfaceDemo;

import gudusoft.gsqlparser.*;
import junit.framework.TestCase;

class myTokenHandle implements ITokenHandle{
    public boolean processToken(TSourceToken st){

        if (st.toString().equalsIgnoreCase("limit")){
            st.tokencode = TBaseType.cmtslashstar;//treat this token as a comment
        }
           return true;
    }
}

public class testITokenHandle extends TestCase {

    public void test1(){
        TGSqlParser sqlparser = new TGSqlParser(EDbVendor.dbvmysql);
        sqlparser.sqltext = "select * from dual limit";
        sqlparser.setTokenHandle(new myTokenHandle());

        assertTrue(sqlparser.parse() == 0);
    }

}

Enums

EActionOnParseError

Used by parser internally.

EAggregateType
EAlterColumnSubType
EAlterIndexOption
EAlterTableOptionType
EAlterTriggerOption
EAlterViewOption
EBoundaryType
EColumnAttributeType
EColumnFormat

MySQL column format attribute

EColumnStorage

MySQL column storage type

EComparisonType
EConstraintType
ECreateTableOption
EDataType

@since v1.4.3.0

EDataTypeAttribute

Teradata datatype attribute

EDbObjectType
EDbVendor

Database vendors supported:
dbvdb2,dbvgreenplum,dbvhive,dbvimpala, dbvinformix,dbvmdx,dbvmssql,dbvmysql, dbvnetezza,dbvoracle,dbvpostgresql,dbvredshift, dbvsybase, dbvteradata

EDeclareType
EErrorType

Type of SQL syntax errors.

EExecuteAsOption

SQL Server execute as option

EExpressionType


scalar expression: return a single value.
primary expressions: are the building blocks of more complex expressions. they are literals, column reference, function call, parenthesis expr, scalar subquery, case expr and etc.
binary expression: has two scalar expression.
unary expression: has one scalar expression.
boolean expression: evaluates to a value of the Boolean Data Type: True or False.



SQL Server

binary expression type: arithmetic_plus_t,arithmetic_minus_t,arithmetic_times_t, arithmetic_divide_t,arithmetic_modulo_t, bitwise_and_t,bitwise_or_t ,bitwise_xor_t



boolean expression:
exists_t,pattern_matching_t, in_t,



boolean comparison type: boolean_comparison_equal_t,boolean_comparison_greaterThan_t, boolean_comparison_lessThan_t, boolean_comparison_greaterThanOrEqualTo_t, boolean_comparison_lessThanOrEqualTo_t, boolean_comparison_notEqualToBrackets_t, boolean_comparison_notEqualToExclamation_t, boolean_comparison_notLessThan_t, boolean_comparison_notGreaterThan_t, boolean_comparison_leftOuterJoin_t, boolean_comparison_rightOuterJoin_t,

EFindSqlStateType

Used by parser internally.

EFunctionType
EGeneratedColumnStoreType

MySQL generated column store type

EIndexOptionType
EIndexType
EInsertSource

source of insert value

EIsolationLevel
EJoinHint

SQL Server Join hint

EJoinType
EKeyActionType
EKeyReferenceType
ELimitRowType
ELiteralType

SQL Server literal type:
integer_et,real_et,money_et,binary_et, string_et,null_et,default_et, max_et,odbc_et,numeric_et,

ENullOrder
EOracleCreateType
EParameterMode
EParserFlag

Used by parser internally.

EProcedureOptionType
EQuantifierType
EQuoteType
ERaiseLevel
ESequenceOptionType
ESetOperatorType
ESetScope
ESetStatementType
ESortType

Used to replace TBaseType.srtNone,srtAsc and srtDesc

ESqlClause

Enum for various SQL clause such as where clause, having clause and etc.

ESqlStatementType

Type of SQL statement.



SQL statement available in more than one databases:
sstunknown,
sstselect, sstdelete, sstupdate,sstinsert,
sstcreatetable, sstcreateview,sstcreatedatabase, sstcreateindex,sstcreatesequence, sstcreatesynonym, sstcreatematerializedview,
sstaltertable,sstalterdatabase, sstalterindex, sstaltersession sstaltertrigger, sstalterview,
sstdropindex,sstdropschema,sstdropsequence,sstdropsynonym, sstdroptable,sstdropview,
sstmerge
sstTruncate,
sst_block,sst_block_with_declare_and_parameters,sst_block_with_exception,,sst_block_with_label,
sst_assignstmt, sst_casestmt, sst_closestmt sst_cursordecl,sstdescribe, sst_elsifstmt, sst_ifstmt,sst_loopstmt,sst_openforstmt, sst_openstmt,sstl_raisestmt,sst_returnstmt

<xref href="gudusoft.gsqlparser.ESqlStatementType.sstGrant" data-throw-if-not-resolved="false"></xref>,<xref href="gudusoft.gsqlparser.ESqlStatementType.sstRevoke" data-throw-if-not-resolved="false"></xref>
<xref href="gudusoft.gsqlparser.ESqlStatementType.sstrename" data-throw-if-not-resolved="false"></xref>,

<xref href="gudusoft.gsqlparser.ESqlStatementType.sstUseDatabase" data-throw-if-not-resolved="false"></xref>



<xref href="gudusoft.gsqlparser.ESqlStatementType.sstExecutePreparedStmt" data-throw-if-not-resolved="false"></xref>, <xref href="gudusoft.gsqlparser.ESqlStatementType.sst_exitstmt" data-throw-if-not-resolved="false"></xref>,

<xref href="gudusoft.gsqlparser.ESqlStatementType.sstExplain" data-throw-if-not-resolved="false"></xref>,
<xref href="gudusoft.gsqlparser.ESqlStatementType.sst_fetchstmt" data-throw-if-not-resolved="false"></xref>



DB2 SQL statement:
sstdb2callstmt,sstdb2casestmt,sstdb2closecursorstmt, sstdb2conditiondeclaration,sstdb2createfunction, sstdb2createprocedure,sstdb2createtrigger sstdb2declarecursorstatement,sstdb2dummystmt, sstdb2dynamiccompoundstatement,sstdb2fetchcursorstmt,sstdb2forstmt, sstdb2ifstmt,sstdb2iteratestmt,sstdb2leavestmt,sstdb2signalstatement, sstdb2whilestmt,sstdb2repeatstmt,sstdb2opencursorstmt,sstdb2gotostmt, sstdb2loopstmt,sstdb2procedurecompoundstatement,sstdb2returnstmt, sstdb2valuesinto,sstdb2stmtstub



Hive SQL statement:
ssthiveExplain,ssthiveLoad,ssthiveExportTable,ssthiveImportTable, ssthiveSwitchDatabase,ssthiveDropDatabase,ssthiveDesc,ssthiveShow, ssthiveMetaStoreCheck,ssthiveAnalyzeTable,ssthiveLockTable,ssthiveUnlockTable, ssthiveCreateRole,ssthiveDropRole,ssthiveGrant,ssthiveRevoke,ssthiveShowGrants, ssthiveShowRoleGrants,ssthiveGrantRole,ssthiveRevokeRole,ssthiveCreateFunction, ssthiveDropFunciton,ssthiveFromQuery,ssthiveSet



Informix SQL statement:
sstinformixAllocateCollection,sstinformixAllocateDescriptor,sstinformixAllocateRow, sstinformixAlterAccess_Method,sstinformixCreateFunction,sstinformixCreateProcedure, sstinformixCreateRowType,sstinformixCreateTrigger,sstinformixDropRowType, sstinformixExecute,sstinformixExecuteFunction,sstinformixExecuteImmediate, sstinformixExecuteProcedure



MDX SQL statement:
sstmdxselect,sstmdxupdate,sstmdxcall,sstmdxclearcalculations, sstmdxdrillthrough,sstmdxaltercube,sstmdxcreateaction,sstmdxcreatecellcalculation, sstmdxcreateglobalcube,sstmdxcreatemember,sstmdxcreatesessioncube,sstmdxcreateset, sstmdxcreatesubcube,sstmdxdropaction,sstmdxdropcellcalculation,sstmdxdropmember, sstmdxdropset,sstmdxdropsubcube,sstmdxrefreshcube,sstmdxcalculate,sstmdxcase, sstmdxfreeze,sstmdxscope,sstmdxexpression



SQL Server SQL statement:
sstmssqlbegindialog,sstmssqlbegintran,sstmssqlblock,sstmssqlbreak, sstmssqlbulkinsert,sstmssqlclose,sstmssqlcommit,sstmssqlcontinue,sstmssqlcreatefunction, sstmssqlcreateprocedure,sstmssqlcreatetrigger,sstmssqldeallocate,sstmssqldeclare, sstmssqldropdbobject,sstmssqldummystmt,sstmssqlendconversation,sstmssqlexec, sstmssqlexecuteas,sstmssqlfetch,sstmssqlgo,sstmssqlgoto,sstmssqlgrant, sstmssqlif,sstmssqllabel,sstmssqlopen,sstmssqlprint,sstraiserror,sstmssqlreturn, sstmssqlrevert,sstmssqlrollback,sstmssqlsavetran,sstmssqlsendonconversation, sstmssqlset,sstmssqlsetrowcount,sstmssqlstmtstub,sstmssqlthrow,sstmssqlupdatestatistics, sstmssqlupdatetext,sstmssqlwaitfor



MySQL SQL statement:
sstmysqlblock,sstmysqlcall,sstmysqlcasestmt,sstmysqlcreatefunction, sstmysqlcreateprocedure,sstmysqlcreatetrigger,sstmysqldeclare,sstmysqlfetchcursor, sstmysqlifstmt,sstmysqlloopstmt,sstmysqlopencursor,sstmysqlprepare,sstmysqlrepeatstmt, sstmysqlreturn, sstmysqlset, sstmysqlsignal,sstmysqlstmtstub,sstmysqlwhilestmt



Netezza SQL statement:
sstnetezzaGenerateStatistics,sstnetezzaGroomTable



Oracle SQL statement:
sstplsql_procbasicstmt, sstcall,sstoraclecomment, sstoraclecreatedirectory,sstoraclecreatelibrary,sstoracleexecuteprocedure,ssstplsqlContinue, sstplsql_createfunction,sstplsql_createpackage,sstplsql_createprocedure,sstplsql_createtrigger, sstplsql_createtype,sstplsql_createtypebody,sstplsql_dummystmt,sstplsql_execimmestmt, sstplsql_forallstmt,sstplsql_gotostmt,sstplsql_nullstmt,sstplsql_piperowstmt,sstplsql_pragmadecl, sstplsql_procedurespec,sstplsql_recordtypedef,sstplsql_tabletypedef,sstplsql_varraytypedef, sstsqlpluscmd



PostgreSQL SQL statement:
sstpostgresqlDropRole,sstpostgresqlDropTrigger,sstpostgresqlExecute, sstpostgresqlforeach,sstpostgresqlgetdiag,sstpostgresqlMove,sstplsql_nullstmt, sstpostgresqlCreateTrigger,sstpostgresqlCreateFunction



Redshift SQL statement:
sstFetchFrom,sstredshiftAbort,sstredshiftAlterGroup,sstredshiftAlterSchema, sstredshiftAlterUser,sstredshiftAnalyze,sstredshiftAnalyzeCompression, sstredshiftBegin,sstredshiftCancel,sstredshiftClose,sstredshiftComment, sstredshiftCommit,sstredshiftCopy,sstredshiftCreateGroup,sstredshiftCreateSchema. sstredshiftCreateUser,sstredshiftDeallocate,sstredshiftDeclare,sstredshiftDropDatabase, sstredshiftDropGroup,sstredshiftDropSchema,sstredshiftDropUser,sstredshiftEnd, sstExplain,sstredshiftLock,sstredshiftPrepare,sstredshiftReset, sstredshiftRollback,sstredshiftSetSessionAuthorization,sstredshiftSet, sstredshiftShow,sstredshiftStartTransaction,sstredshiftUnload,sstredshiftVacuum



Sybase SQL statement:
sstsybaseInsertBulk,sstsybaseDeleteStatistics,sstsybasedumpTran, sstsybaseupdateindexstatistics,sstsybaseWritetext



Teradata SQL statement:
sstteradataabort,sstteradatabeginlogging,sstteradatabegintransaction, sstteradataBTEQ,sstteradatacollectstatistics,sstteradatacommit,sstteradatacreatefunction, sstteradatacreateprocedure,sstteradatacreatetrigger,sstteradatadropdbobject, sstteradataendlogging,sstteradataendtransaction,sstteradataexecute, sstteradatagive,sstteradatagrant,sstteradatalock,sstteradatarollback, sstteradatasetrole,sstteradatasetsession,sstteradatasettimezone,sstteradatausing

ETableEffectType
ETableOption

MySQL create table option

ETableSource
ETokenStatus

Used by parser internally.

ETokenType

Type of source token.

ETransactionIsolationLevel
ETriggerDmlType
ETriggerTimingPoint
EUniqueRowFilterType
EWindowExcludeType