Namespace gudusoft.gsqlparser
Classes
- TBaseType
Constants and functions used by parser internally.
- TComponent
Placeholder
- TCustomLexer
Base lexer of all databases.
- TCustomParser
Base parser of all databases.
- TCustomSqlStatement
TCustomSqlStatement is the root class for all SQL statements.
- TGSqlParser
This is the class where you start to use this SQL parser library.
create a sql parser by specify a database vendor, then set sql script or a file including sql script need to be processed, then call one of the following methods to achieve what you need:
- parse(), check syntax of input SQL,
do a in-depth analysis of input SQL such as creating relationship between table and columns.
SQL query parse tree nodes are available after calling parse method and return without
any syntax errors.
Syntax error in one SQL statement in SQL scripts separated by statement delimiter such as ; doesn't affects others, but failed to parse a SQL statement insides a stored procedure will leads all SQL statements inside this stored procedure un-parsed which means no parse tree nodes will be available for this stored procedure.
If syntax error detected, please check ErrorCount and Errormessage. SyntaxErrors include a list of syntax error object.
- checkSyntax(), same as parse()
- tokenizeSqltext(), Turn input sql into a list of tokens which is the
basic lexis element of SQL syntax.Token is categorized as keyword, identifier,
number, operator, whitespace and other types. Fetch source tokens via sourcetokenlist
Parse tree node is not available
- getrawsqlstatements(), Separate SQL statement in a script file without doing syntax checking, a list of sql statements which is sub class of TCustomSqlStatement is avaiable after calling this method. Use statement type ESqlStatementType to check type of SQL statement, Use String to get text representation of each SQL statement, no further detailed information about SQL statement was available. You may fetch SQL statements via sqlstatements
Sample code used to format SQL script.
int ret = sqlparser.parse(); if (ret == 0){ GFmtOpt option = GFmtOptFactory.newInstance(); String result = FormatterFactory.pp(sqlparser, option); System.out.println(result); }else{ System.out.println(sqlparser.getErrormessage()); }Some use cases after paring SQL script: are:
- Table/column impact analysis
- SQL rewriting
- SQL translate between different databases
- SQL migration analysis
- Help to anti SQL injection
- More use cases
Typically, SQL parse tree nodes generated by this SQL Parser were closely related to SQL elements defined in database vendor's SQL reference book. here is a brief summary of some most used SQL elements and corresponding classes defined in this SQL parser.
- SQL identifier: TObjectName
- SQL literal: TConstant
- SQL datatype: TTypeName
- SQL function: TFunctionCall
- SQL constraint: TConstraint
- SQL expression/condition: TExpression
- SQL select list item: TResultColumn
- More: gudusoft.gsqlparser.nodes
Some major SQL statements:
- Select: TSelectSqlStatement
- Delete: TDeleteSqlStatement
- Insert: TInsertSqlStatement
- Update: TUpdateSqlStatement
- Create table: TCreateTableSqlStatement
- More: gudusoft.gsqlparser.stmt
Stored procedure
- Create function: TDb2CreateFunction, TMssqlCreateProcedure, TMySQLCreateFunction, TPlsqlCreateFunction
- Create procedure: TDb2CreateProcedure, TMssqlCreateProcedure, TMySQLCreateProcedure, TPlsqlCreateProcedure
- Create trigger: TDb2CreateTrigger, TMssqlCreateTrigger, TMySQLCreateTrigger, TPlsqlCreateTrigger
- Create package: TPlsqlCreatePackage
- parse(), check syntax of input SQL,
do a in-depth analysis of input SQL such as creating relationship between table and columns.
SQL query parse tree nodes are available after calling parse method and return without
any syntax errors.
- TSourceToken
Represents a source token which is the basic syntactical unit of SQL. All parse tree nodes including one or more tokens.
A list of source token will be available after parse or tokenize the input SQL.TGSqlParser sqlparser = new TGSqlParser(EDbVendor.dbvoracle); sqlparser.sqltext = "select col from t"; int ret = sqlparser.parse(); // this will geneate a list of tokens only, no parse tree node is available. // int ret = sqlparser.tokenizeSqltext(); if (ret == 0){ for(int i=0;i<sqlparser.sourcetokenlist.size();i++){ TSourceToken st = sqlparser.sourcetokenlist.get(i); System.out.println(st.tokentype.toString()+" "+st.toString()); } }else{ System.out.println(sqlparser.getErrormessage()); } }Get a list of source tokens after call parse() or just call tokenizeSqltext() if you only need to access tokens of input SQL without generating full parse tree nodes.
tokencode is the unique id represents one kind of token. The most important type of whitespace, return, and tokentype
- TSourceTokenList
List of source token.
- TStatementList
List of SQL statements which is instance of class that descends from TCustomSqlStatement.
- TSyntaxError
Detailed information about syntax error.
- TVisitedAbs
Base class of all parse tree nodes that can be visited by visitors.
- TVisitorAbs
Base class of all visitors that used to visit parse tree nodes.
Interfaces
- IMetaDatabase
In order to link column to table correctly without connecting to database, we need to provide a class which implements IMetaDatabase to TGSqlParser. this class tells TGSqlParser the relation ship between column and table.
Take this SQL for example:
SELECT Quantity,b.Time,c.Description FROM (SELECT ID2,Time FROM bTab) b INNER JOIN aTab a on a.ID=b.ID INNER JOIN cTab c on a.ID=c.IDGeneral SQL Parser can build relationship between column: ID2 and table: bTable correctly without metadata information from database because there is only one table in from clause. But it can't judge column: Quantity belong to table: aTab or cTab, since no table alias was prefixed to column: Quantity. If no metadata provided, General SQL Parser will link column: Quantity to the first valid table (here it is aTab)
If we create a class metaDB implements IMetaDatabase,then TGSqlParser.setMetaDatabase(new metaDB()), General SQL Parser can take this advantage to create a correct relationship between column and tables. Here is a sample of metaDB, you should create your own metaDB class with meta information from database.
public class metaDB implements IMetaDatabase { String columns[][] = { {"dbo","aTab","Quantity1"}, {"dbo","bTab","Quantity2"}, {"dbo","cTab","Quantity"} }; public boolean checkColumn(String schema, String table, String column){ boolean bSchema,bTable,bColumn,bRet = false; for (int i=0; i<columns.length;i++){ if (schema == null){ bSchema = true; }else{ bSchema = columns[i][0].equalsIgnoreCase(schema); } if (!bSchema) continue; bTable = columns[i][1].equalsIgnoreCase(table); if (!bTable) continue; bColumn = columns[i][2].equalsIgnoreCase(column); if (!bColumn) continue; bRet =true; break; } return bRet; } }After TGSqlParser.setMetaDatabase(new metaDB()), General SQL parser can generate this result for you:
atab.id btab.id btab.time ctab.description ctab.id ctab.quantity
- ITokenHandle
use this interface if you like to change token properties generated by lexer before send to parser.
package test.interfaceDemo; import gudusoft.gsqlparser.*; import junit.framework.TestCase; class myTokenHandle implements ITokenHandle{ public boolean processToken(TSourceToken st){ if (st.toString().equalsIgnoreCase("limit")){ st.tokencode = TBaseType.cmtslashstar;//treat this token as a comment } return true; } } public class testITokenHandle extends TestCase { public void test1(){ TGSqlParser sqlparser = new TGSqlParser(EDbVendor.dbvmysql); sqlparser.sqltext = "select * from dual limit"; sqlparser.setTokenHandle(new myTokenHandle()); assertTrue(sqlparser.parse() == 0); } }
Enums
- EActionOnParseError
Used by parser internally.
- EColumnFormat
MySQL column format attribute
- EColumnStorage
MySQL column storage type
- EDataType
@since v1.4.3.0
- EDataTypeAttribute
Teradata datatype attribute
- EDbVendor
Database vendors supported:
dbvdb2,dbvgreenplum,dbvhive,dbvimpala, dbvinformix,dbvmdx,dbvmssql,dbvmysql, dbvnetezza,dbvoracle,dbvpostgresql,dbvredshift, dbvsybase, dbvteradata
- EErrorType
Type of SQL syntax errors.
- EExecuteAsOption
SQL Server execute as option
- EExpressionType
scalar expression: return a single value.
primary expressions: are the building blocks of more complex expressions. they are literals, column reference, function call, parenthesis expr, scalar subquery, case expr and etc.
binary expression: has two scalar expression.
unary expression: has one scalar expression.
boolean expression: evaluates to a value of the Boolean Data Type: True or False.
SQL Server
binary expression type: arithmetic_plus_t,arithmetic_minus_t,arithmetic_times_t, arithmetic_divide_t,arithmetic_modulo_t, bitwise_and_t,bitwise_or_t ,bitwise_xor_t
boolean expression:
exists_t,pattern_matching_t, in_t,
boolean comparison type: boolean_comparison_equal_t,boolean_comparison_greaterThan_t, boolean_comparison_lessThan_t, boolean_comparison_greaterThanOrEqualTo_t, boolean_comparison_lessThanOrEqualTo_t, boolean_comparison_notEqualToBrackets_t, boolean_comparison_notEqualToExclamation_t, boolean_comparison_notLessThan_t, boolean_comparison_notGreaterThan_t, boolean_comparison_leftOuterJoin_t, boolean_comparison_rightOuterJoin_t,
- EFindSqlStateType
Used by parser internally.
- EGeneratedColumnStoreType
MySQL generated column store type
- EInsertSource
source of insert value
- EJoinHint
SQL Server Join hint
- ELiteralType
SQL Server literal type:
integer_et,real_et,money_et,binary_et, string_et,null_et,default_et, max_et,odbc_et,numeric_et,
- EParserFlag
Used by parser internally.
- ESortType
Used to replace TBaseType.srtNone,srtAsc and srtDesc
- ESqlClause
Enum for various SQL clause such as where clause, having clause and etc.
- ESqlStatementType
Type of SQL statement.
SQL statement available in more than one databases:
sstunknown,
sstselect, sstdelete, sstupdate,sstinsert,
sstcreatetable, sstcreateview,sstcreatedatabase, sstcreateindex,sstcreatesequence, sstcreatesynonym, sstcreatematerializedview,
sstaltertable,sstalterdatabase, sstalterindex, sstaltersession sstaltertrigger, sstalterview,
sstdropindex,sstdropschema,sstdropsequence,sstdropsynonym, sstdroptable,sstdropview,
sstmerge
sstTruncate,
sst_block,sst_block_with_declare_and_parameters,sst_block_with_exception,,sst_block_with_label,
sst_assignstmt, sst_casestmt, sst_closestmt sst_cursordecl,sstdescribe, sst_elsifstmt, sst_ifstmt,sst_loopstmt,sst_openforstmt, sst_openstmt,sstl_raisestmt,sst_returnstmt<xref href="gudusoft.gsqlparser.ESqlStatementType.sstGrant" data-throw-if-not-resolved="false"></xref>,<xref href="gudusoft.gsqlparser.ESqlStatementType.sstRevoke" data-throw-if-not-resolved="false"></xref> <xref href="gudusoft.gsqlparser.ESqlStatementType.sstrename" data-throw-if-not-resolved="false"></xref>, <xref href="gudusoft.gsqlparser.ESqlStatementType.sstUseDatabase" data-throw-if-not-resolved="false"></xref> <xref href="gudusoft.gsqlparser.ESqlStatementType.sstExecutePreparedStmt" data-throw-if-not-resolved="false"></xref>, <xref href="gudusoft.gsqlparser.ESqlStatementType.sst_exitstmt" data-throw-if-not-resolved="false"></xref>, <xref href="gudusoft.gsqlparser.ESqlStatementType.sstExplain" data-throw-if-not-resolved="false"></xref>, <xref href="gudusoft.gsqlparser.ESqlStatementType.sst_fetchstmt" data-throw-if-not-resolved="false"></xref>
DB2 SQL statement:
sstdb2callstmt,sstdb2casestmt,sstdb2closecursorstmt, sstdb2conditiondeclaration,sstdb2createfunction, sstdb2createprocedure,sstdb2createtrigger sstdb2declarecursorstatement,sstdb2dummystmt, sstdb2dynamiccompoundstatement,sstdb2fetchcursorstmt,sstdb2forstmt, sstdb2ifstmt,sstdb2iteratestmt,sstdb2leavestmt,sstdb2signalstatement, sstdb2whilestmt,sstdb2repeatstmt,sstdb2opencursorstmt,sstdb2gotostmt, sstdb2loopstmt,sstdb2procedurecompoundstatement,sstdb2returnstmt, sstdb2valuesinto,sstdb2stmtstub
Hive SQL statement:
ssthiveExplain,ssthiveLoad,ssthiveExportTable,ssthiveImportTable, ssthiveSwitchDatabase,ssthiveDropDatabase,ssthiveDesc,ssthiveShow, ssthiveMetaStoreCheck,ssthiveAnalyzeTable,ssthiveLockTable,ssthiveUnlockTable, ssthiveCreateRole,ssthiveDropRole,ssthiveGrant,ssthiveRevoke,ssthiveShowGrants, ssthiveShowRoleGrants,ssthiveGrantRole,ssthiveRevokeRole,ssthiveCreateFunction, ssthiveDropFunciton,ssthiveFromQuery,ssthiveSet
Informix SQL statement:
sstinformixAllocateCollection,sstinformixAllocateDescriptor,sstinformixAllocateRow, sstinformixAlterAccess_Method,sstinformixCreateFunction,sstinformixCreateProcedure, sstinformixCreateRowType,sstinformixCreateTrigger,sstinformixDropRowType, sstinformixExecute,sstinformixExecuteFunction,sstinformixExecuteImmediate, sstinformixExecuteProcedure
MDX SQL statement:
sstmdxselect,sstmdxupdate,sstmdxcall,sstmdxclearcalculations, sstmdxdrillthrough,sstmdxaltercube,sstmdxcreateaction,sstmdxcreatecellcalculation, sstmdxcreateglobalcube,sstmdxcreatemember,sstmdxcreatesessioncube,sstmdxcreateset, sstmdxcreatesubcube,sstmdxdropaction,sstmdxdropcellcalculation,sstmdxdropmember, sstmdxdropset,sstmdxdropsubcube,sstmdxrefreshcube,sstmdxcalculate,sstmdxcase, sstmdxfreeze,sstmdxscope,sstmdxexpression
SQL Server SQL statement:
sstmssqlbegindialog,sstmssqlbegintran,sstmssqlblock,sstmssqlbreak, sstmssqlbulkinsert,sstmssqlclose,sstmssqlcommit,sstmssqlcontinue,sstmssqlcreatefunction, sstmssqlcreateprocedure,sstmssqlcreatetrigger,sstmssqldeallocate,sstmssqldeclare, sstmssqldropdbobject,sstmssqldummystmt,sstmssqlendconversation,sstmssqlexec, sstmssqlexecuteas,sstmssqlfetch,sstmssqlgo,sstmssqlgoto,sstmssqlgrant, sstmssqlif,sstmssqllabel,sstmssqlopen,sstmssqlprint,sstraiserror,sstmssqlreturn, sstmssqlrevert,sstmssqlrollback,sstmssqlsavetran,sstmssqlsendonconversation, sstmssqlset,sstmssqlsetrowcount,sstmssqlstmtstub,sstmssqlthrow,sstmssqlupdatestatistics, sstmssqlupdatetext,sstmssqlwaitfor
MySQL SQL statement:
sstmysqlblock,sstmysqlcall,sstmysqlcasestmt,sstmysqlcreatefunction, sstmysqlcreateprocedure,sstmysqlcreatetrigger,sstmysqldeclare,sstmysqlfetchcursor, sstmysqlifstmt,sstmysqlloopstmt,sstmysqlopencursor,sstmysqlprepare,sstmysqlrepeatstmt, sstmysqlreturn, sstmysqlset, sstmysqlsignal,sstmysqlstmtstub,sstmysqlwhilestmt
Netezza SQL statement:
sstnetezzaGenerateStatistics,sstnetezzaGroomTable
Oracle SQL statement:
sstplsql_procbasicstmt, sstcall,sstoraclecomment, sstoraclecreatedirectory,sstoraclecreatelibrary,sstoracleexecuteprocedure,ssstplsqlContinue, sstplsql_createfunction,sstplsql_createpackage,sstplsql_createprocedure,sstplsql_createtrigger, sstplsql_createtype,sstplsql_createtypebody,sstplsql_dummystmt,sstplsql_execimmestmt, sstplsql_forallstmt,sstplsql_gotostmt,sstplsql_nullstmt,sstplsql_piperowstmt,sstplsql_pragmadecl, sstplsql_procedurespec,sstplsql_recordtypedef,sstplsql_tabletypedef,sstplsql_varraytypedef, sstsqlpluscmd
PostgreSQL SQL statement:
sstpostgresqlDropRole,sstpostgresqlDropTrigger,sstpostgresqlExecute, sstpostgresqlforeach,sstpostgresqlgetdiag,sstpostgresqlMove,sstplsql_nullstmt, sstpostgresqlCreateTrigger,sstpostgresqlCreateFunction
Redshift SQL statement:
sstFetchFrom,sstredshiftAbort,sstredshiftAlterGroup,sstredshiftAlterSchema, sstredshiftAlterUser,sstredshiftAnalyze,sstredshiftAnalyzeCompression, sstredshiftBegin,sstredshiftCancel,sstredshiftClose,sstredshiftComment, sstredshiftCommit,sstredshiftCopy,sstredshiftCreateGroup,sstredshiftCreateSchema. sstredshiftCreateUser,sstredshiftDeallocate,sstredshiftDeclare,sstredshiftDropDatabase, sstredshiftDropGroup,sstredshiftDropSchema,sstredshiftDropUser,sstredshiftEnd, sstExplain,sstredshiftLock,sstredshiftPrepare,sstredshiftReset, sstredshiftRollback,sstredshiftSetSessionAuthorization,sstredshiftSet, sstredshiftShow,sstredshiftStartTransaction,sstredshiftUnload,sstredshiftVacuum
Sybase SQL statement:
sstsybaseInsertBulk,sstsybaseDeleteStatistics,sstsybasedumpTran, sstsybaseupdateindexstatistics,sstsybaseWritetext
Teradata SQL statement:
sstteradataabort,sstteradatabeginlogging,sstteradatabegintransaction, sstteradataBTEQ,sstteradatacollectstatistics,sstteradatacommit,sstteradatacreatefunction, sstteradatacreateprocedure,sstteradatacreatetrigger,sstteradatadropdbobject, sstteradataendlogging,sstteradataendtransaction,sstteradataexecute, sstteradatagive,sstteradatagrant,sstteradatalock,sstteradatarollback, sstteradatasetrole,sstteradatasetsession,sstteradatasettimezone,sstteradatausing
- ETableOption
MySQL create table option
- ETokenStatus
Used by parser internally.
- ETokenType
Type of source token.