What objects/functions are available for custom data extractors?


From looking at the included custom data extractors (e.g. HTML-JavaScript.html.js), I can see that the objects/variables TRANSPOSED, COLUMNS, ROWS and FORMATTER are available.

Is there any documentation on these objects?
What methods are available on them?
Are there other objects that are available?

Comment actions Permalink

Hi Vidar!

This API is experimental: it is not published yet and is subject to change.

But here is an overview of what is available:

DIALECT
    Database dialect.

TABLE (com.intellij.database.model.DasTable)
    Database table metadata.

ALL_COLUMNS (java.util.List<? extends com.intellij.database.extractors.DataColumn>)
    A list of all columns in the query result / currently open table editor.
    com.intellij.database.extractors.DataColumn exposes:
        int columnNumber();
        String name();

COLUMNS (java.util.List<? extends com.intellij.database.extractors.DataColumn>)
    A list of the selected columns in the query result / currently open table editor.

FORMATTER
    Exposes default formatting methods:
        java.lang.String format(com.intellij.database.extractors.DataRow, com.intellij.database.extractors.DataColumn);
        java.lang.String format(java.lang.Object, com.intellij.database.extractors.DataColumn);

OUT (java.lang.Appendable)
    Extractor output.

ROWS (java.lang.Iterable<com.intellij.database.extractors.DataRow>)
    Row data stream. com.intellij.database.extractors.DataRow exposes:
        int rowNumber();
        boolean first();
        boolean last();
        java.util.List<java.lang.Object> data();
        Object value(com.intellij.database.extractors.DataColumn);

TRANSPOSED (java.lang.Boolean)
    Indicates whether the table editor is in transposed mode.
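These bindings are injected by the IDE, so an extractor script imports nothing. As a minimal sketch, here is a tab-separated extractor; the COLUMNS/ROWS/FORMATTER/OUT stubs at the top are stand-ins for what DataGrip provides, included only so the sketch runs standalone, and are not the real implementations.

```javascript
// Stubs standing in for the bindings DataGrip injects (COLUMNS, ROWS,
// FORMATTER, OUT). Inside the IDE these objects already exist; only the
// lines after "Extractor body" are the actual extractor logic.
var COLUMNS = [
  { columnNumber: function () { return 0; }, name: function () { return "id"; } },
  { columnNumber: function () { return 1; }, name: function () { return "title"; } }
];
function makeRow(values) {
  return {
    data: function () { return values; },
    value: function (col) { return values[col.columnNumber()]; }
  };
}
var ROWS = [makeRow([1, "first"]), makeRow([2, "second"])];
var FORMATTER = { format: function (row, col) { return String(row.value(col)); } };
var out = "";
var OUT = { append: function (s) { out += s; return this; } };

// Extractor body: header line, then one tab-separated line per row.
OUT.append(COLUMNS.map(function (c) { return c.name(); }).join("\t")).append("\n");
ROWS.forEach(function (row) {
  OUT.append(COLUMNS.map(function (c) { return FORMATTER.format(row, c); }).join("\t")).append("\n");
});
console.log(out);
```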

I hope it helps. Feel free to ask questions and share your ideas on how to improve this feature.

Comment actions Permalink

Good news!

Starting with the 1.0 Public Preview build, we're shipping sources for open API parts of the product.

They can be found in %IDE_HOME%/lib/src/src_database-openapi.zip.

Happy hacking!

Comment actions Permalink

Great, thanks!

Is the source code or API of those com.intellij.database.* classes (like com.intellij.database.model.DasTable) available somewhere?

Comment actions Permalink

Not yet, but they will become a part of public API, too.

Until then, if you need help figuring out something - just ask, and we'll try to help you.

BTW, what kind of extractor are you trying to build?

Comment actions Permalink

Currently I'm building an extractor that will provide data for our DB test runner, in the form

<TABLENAME COLUMN1="column1value" COLUMN2="column2value"/>

- which I just now succeeded in doing, thanks to TABLE.name. Thanks!
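The per-row XML line can be sketched like this; the tableName parameter stands in for TABLE.name, and the attribute-escaping helper is my own minimal assumption, not anything DataGrip provides:

```javascript
// Build <TABLENAME COLUMN1="v1" COLUMN2="v2"/> for one row. In a real
// extractor, tableName would come from TABLE.name and the names/values
// from COLUMNS and FORMATTER.
function xmlAttrEscape(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}
function rowToXml(tableName, names, values) {
  var attrs = names.map(function (n, i) {
    return n + '="' + xmlAttrEscape(values[i]) + '"';
  });
  return "<" + tableName + " " + attrs.join(" ") + "/>";
}
console.log(rowToXml("PRODUCT", ["COLUMN1", "COLUMN2"], ["column1value", "column2value"]));
// → <PRODUCT COLUMN1="column1value" COLUMN2="column2value"/>
```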

Previously I wrote an extractor that outputs the resultset in JIRA markup table format:

|| COLUMN1 || COLUMN2 ||
|  R1C1 | R1C2 |
|  R2C1 | R2C2|

I'll post the code for the JIRA extractor here, in case it can help anyone:

var NEWLINE = "\n";

// Append all arguments to the extractor output.
function output() {
  for (var i = 0; i < arguments.length; i++) {
    OUT.append(arguments[i]);
  }
}

// Header row: || COLUMN1 || COLUMN2 ||
var headerIterator = COLUMNS.iterator();
while (headerIterator.hasNext()) {
  var col = headerIterator.next();
  output("|| ", col.name(), " ");
}
output("||", NEWLINE);

// Data rows: | R1C1 | R1C2 |
var rowIterator = ROWS.iterator();
while (rowIterator.hasNext()) {
  var row = rowIterator.next();
  var columnIterator = COLUMNS.iterator();
  while (columnIterator.hasNext()) {
    var column = columnIterator.next();
    output("| ", FORMATTER.format(row, column), " ");
  }
  output("|", NEWLINE);
}

Date formatting will probably be an issue soon.

Where does FORMATTER get its date format from, when formatting an Oracle DATE value, for example?

Comment actions Permalink

For now, it doesn't do anything clever - it just uses "yyyy-MM-dd HH:mm:ss" for timestamps, "yyyy-MM-dd" for dates, and "HH:mm:ss" for times.

So you may want to use the instanceof operator to dispatch on value types and format them the way you want.
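The instanceof dispatch can be sketched like this. In a real extractor the value would be a JDBC date/timestamp object, so checking against a plain JS Date here is an assumption made only so the sketch runs standalone:

```javascript
// Dispatch on the value's type before falling back to default formatting.
// A plain JS Date stands in for the java.sql date/timestamp objects an
// extractor would actually receive.
function pad(n) { return (n < 10 ? "0" : "") + n; }
function formatValue(value) {
  if (value instanceof Date) {
    // Custom date format instead of the default "yyyy-MM-dd".
    return pad(value.getDate()) + "." + pad(value.getMonth() + 1) + "." + value.getFullYear();
  }
  return String(value);
}
console.log(formatValue(new Date(2020, 11, 16)));  // date branch: 16.12.2020
console.log(formatValue(42));                      // fallback branch: 42
```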

Comment actions Permalink

Woohooo! :)

Comment actions Permalink

If anyone is interested: I wrote a MySQL-like output format the other day. For small datasets it makes it easy to communicate the values using a fixed-width font:

| sapID | fulfilment_type | start_time          |
| 3595  | COLLECTION      | 1970-01-05 08:00:00 |
| 3595  | COLLECTION      | 1970-01-05 10:00:00 |
| 3595  | COLLECTION      | 1970-01-05 12:00:00 |
| 3595  | COLLECTION      | 1970-01-05 14:00:00 |
| 3595  | COLLECTION      | 1970-01-05 16:00:00 |
5 rows in set.

You can find it here https://github.com/aukevanleeuwen/intellij-extractors 
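The core of such a fixed-width format is computing the maximum width of each column before printing anything. A minimal sketch of that idea (plain arrays stand in for COLUMNS/ROWS; this is not the linked extractor's code):

```javascript
// Compute per-column widths from the header and all rows, then pad every
// cell to its column's width.
function mysqlTable(header, rows) {
  var widths = header.map(function (h, i) {
    return rows.reduce(function (w, r) {
      return Math.max(w, String(r[i]).length);
    }, h.length);
  });
  function line(cells) {
    return "| " + cells.map(function (c, i) {
      var s = String(c);
      while (s.length < widths[i]) s += " ";
      return s;
    }).join(" | ") + " |";
  }
  return [line(header)].concat(rows.map(line)).join("\n") +
         "\n" + rows.length + " rows in set.";
}
var demo = mysqlTable(["sapID", "fulfilment_type"], [[3595, "COLLECTION"]]);
console.log(demo);
```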

Comment actions Permalink

@Sergey Savenko thanks, but how do I import the package?

I use

import com.intellij.database.model.DasTable

but it doesn't work.

Comment actions Permalink

Unfortunately, %IDE_HOME%/lib/src/src_database-openapi.zip was included in datagrip-2016.1.exe, but is missing in datagrip-2016.3.2.exe and later versions.

And the com.intellij.database Java sources still aren't published in any googlable place.

Comment actions Permalink

When highlighting a sql select statement and using the execute to file/SQL Inserts method, the TABLE object supplied to the extractors is null and all insert statements are using "MY_TABLE" as the table name. 



However, when executing the same statement without export, then exporting from the results pane, the table object gets supplied to the sql inserts script and the insert statements are using the correct table name.  

insert into report.product( ...
insert into report.product( ...
insert into report.product( ...


Is there a way to get the table name for an extractor when executing to file?

I also notice that when using the same export/extractor method (SQL Inserts), the case of the generated SQL differs: it is lowercase when the method is used from the results pane, but not when it is applied directly to a SQL statement. I feel like this is a clue, but I don't know where to go from here. Thanks!



Comment actions Permalink

Howdy, is there a way of dumping multiple tabs or the output of the entire script file to the same out file?

Comment actions Permalink

I am writing a data extractor that automatically creates a DataFixture for PHP based on a query. It is mostly finished, but I am having some issues with timestamps and want to know the type of the column (to detect timestamps). Can you give me a clue, please?

I have tried something like this:

def t = COLUMNS[idx].type().contains("timestamp")


Comment actions Permalink

I am trying to write an Oracle SQL multi-row insert extractor. It is almost finished, but the last detail is the date columns. Right now I have to manually (outside DataGrip) convert "2020-12-16 10:21:11.876990" to "TO_DATE('2020-12-16 10:21:11', 'YYYY-MM-DD HH24:MI:SS')".

I can make a regular expression test to check whether it's a date, but it would be much cleaner if I had the type of the column, which I believe is in com.intellij.database.extractors.DataColumnImpl, but I can't find any documentation on the class.
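Until the column type is exposed to the script, the regular-expression approach can be sketched like this; the pattern, the helper name, and the 19-character truncation are assumptions to adapt, not a recommended implementation:

```javascript
// If the formatted value looks like a timestamp, wrap it in TO_DATE and
// drop the fractional seconds; otherwise emit a plain quoted string
// literal with single quotes doubled.
var DATE_RE = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/;
function toOracleLiteral(s) {
  if (DATE_RE.test(s)) {
    return "TO_DATE('" + s.substring(0, 19) + "', 'YYYY-MM-DD HH24:MI:SS')";
  }
  return "'" + s.replace(/'/g, "''") + "'";
}
console.log(toOracleLiteral("2020-12-16 10:21:11.876990"));
console.log(toOracleLiteral("O'Brien"));
```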

Does anyone know where I can find these docs?



Comment actions Permalink

In the 2020.3 release, the documentation in the extractor header was updated to add getTypeName and isStringLiteral.

FORMATTER { format(row, col); formatValue(Object, col); getTypeName(Object, col); isStringLiteral(Object, col); }

getTypeName(object, col) will return the datatype of a column as reported by the db. Although both row and column are required here, from what I can tell this is the datatype of the column, not of the row. If you're using SQLite and a column holds multiple datatypes, the info from the first row will be returned.

I use Oracle 12c, SQL Server, and SQLite.  Below are datatypes I extracted from my db.

SQL Server strings[binary, char, image, nchar, ntext, nvarchar, text, varbinary, varchar]
numbers[bigint, decimal, float, int, int identity, money, numeric, real, smallint, smallmoney, tinyint]
date [datetime, smalldatetime, timestamp]
other [bit, sql_variant, uniqueidentifier]
date datatypes not supported (always returned as nvarchar) [datetime2, date, time, datetimeoffset]

Oracle strings[blob, char, clob, long raw, nchar, nclob, nvarchar2, raw, rowid, varchar2]
numbers[binary_double, binary_float, number]
date [date, intervalds, intervalym, timestamp, timestamp with local time zone, timestamp with time zone]
other [bfile, sys.xmltype]

SQLite strings[blob, char, character, clob, native character, nchar, nvarchar, string, text, varchar, varying character]
numbers[bigint, decimal, double, double precision, float, int, int2, int8, integer, mediumint, numeric, real, smallint, tinyint, unsigned big integer]
date [date, datetime, time]
other [boolean, none]
dynamic datatypes (unknown) not supported - always returned as datatype of first record instead of result of typeof(<field>)


Here is an example of usage from my MERGE extractor:

// arrColName, SEP, NEWLINE, QUOTE and STRING_PREFIX are defined elsewhere in the extractor.
ALL_COLUMNS.eachWithIndex { column, idx ->
    def value = dataRow.value(column)
    def isStringLiteral = value != null && FORMATTER.isStringLiteral(value, column)
    def stringValue = FORMATTER.format(dataRow, column)
    def typeName = FORMATTER.getTypeName(dataRow, column)
    def isOracle = DIALECT.getDbms().isOracle()
    if (isOracle && value != null && typeName.toUpperCase() in ["DATE", "TIMESTAMP", "TIMESTAMP WITH LOCAL TIME ZONE", "TIMESTAMP WITH TIME ZONE"]) {
        OUT.append("TO_DATE('" + stringValue + "', 'YYYY-MM-DD HH24:MI:SS')")
           .append(idx != ALL_COLUMNS.size() - 1 ? " AS " + arrColName[idx] + SEP + NEWLINE : " AS " + arrColName[idx] + NEWLINE)
    } else {
        OUT.append(isStringLiteral ? (STRING_PREFIX + QUOTE) : "")
           .append(stringValue ? stringValue.replace(QUOTE, QUOTE + QUOTE) : stringValue)
           .append(isStringLiteral ? QUOTE : "")
           .append(idx != ALL_COLUMNS.size() - 1 ? " AS " + arrColName[idx] + SEP + NEWLINE : " AS " + arrColName[idx] + NEWLINE)
    }
}

isStringLiteral(object, col) will return a boolean indicating whether the field is a string datatype. This came about due to false positives when trying to identify numeric fields: before, we had to convert to a string and then check whether that string was a number, which resulted in account-number fields being converted.

Old Way: convert the value to a string, then check whether that string parses as a number.

New Way:
def isStringLiteral = value != null && FORMATTER.isStringLiteral(value, column)


