Using Database Console results in Script

Created November 14, 2019 18:39

I have a python3 project I am working on that requires grabbing data from a database. Right now, I am having errors using pyodbc because the query is long (multiple temp tables, and includes a recursive statement) and having trouble figuring out where it fails. I am trying to figure out the best way to proceed from here. I figure I have two options at the moment. 1) figure out why pyodbc is having troubles. 2)See if there is an alternative tool.

I came across the DB console as a potential alternate tool, which seems to actually run more of the query than pyodbc. I need the data resutls for another script I will run. Can I run the query from DB console as part of my project, then have my python script use the results to ultimately put them in a pandas dataframe?

Or does the DB console only allow saving the results?

If the DB console only allows saving results, is there a way to spin up a DB console as part of my project, have it run the query and save results to a file, and then have another python script open that file using pd.read_csv()

5 comments

Med1n18

Created May 15, 2020 10:56

Brian - did you ever figure this one out? I'm stuck on exactly the same thing presently. Mark

Andrey Resler

Created May 15, 2020 13:33

I doubt this is possible from IDE. I think the better approach would be a combination of shell script + database's native tool to run queries.

Med1n18

Created May 15, 2020 13:39

Andrey- thank you for writing back. Really very much appreciated.

I'm very new to this so any advice is really helpful.

So, to clarify, you would suggest a .py script using PyMongo to query the db?

Andrey Resler

Created May 15, 2020 14:01

Well, there's many approaches. If you're working with MongoDB, PyMongo is preferred. You can also output your query to a file and then call mongo tool from python as a system command to run that query.

Brian Hunt

Created May 15, 2020 14:14

I finally got pyodbc to run. In the case of a long query, I had to break it into parts.

So in file1.sql I put 2 or 3 temp tables

in file2.sql I put 2-3 more etc...

At the top of each file (maybe only required in the first, I'm actually not sure-I did all just to be safe and it was more of a one off project) you need to put "SET NOCOUNT ON;" at the top. Basically that line says don't spit out intermediate results that mess up pyodbc.

In other words, it tells sql that I'm not finished yet is my novice interpretation.

In my .py script, I then put:

con = pyodbc.connect(conn_string)
cursor = con.cursor()

with open() as p1,
     open() as p2...
sql1 = p1.read()
sql2 = p2.read()

#Finally using those partial queries (complete sets of temp tables)

cursor.execute(sql1)
cursor.execute(sql2)
...

For the results, I put in my last query file saved into the string "sql6" contains: 

select * from ##my_last_temp_table #global temp table.

#And inside my script:
df = pd.read_sql(sql6, con)

I don't know if this will help or contains a similar situtation as you, but by breaking up the query, somehow this enabled pyodbc to not error out. I suspect, pyodbc has trouble parsing long queries or interpreted something inside incorrectly. I never really debugged why.

Please sign in to leave a comment.