Category Archives: Notes to Myself

Things I’ve figured out and want to remember how to do the next time around.

Using JdbcTemplate in Spring Boot to Insert a Record and Return a ID

There is more than one way to skin a cat (pardon to my cats). In fact, to insert a record into a database within Spring Boot, I found at least 6 ways:

  • StoredProcedureQuery
  • Prepared Statements (with various flavors)
  • Callable Statements (with various flavors)
  • JdbcTemplates
  • JdbcTemplates with Prepared Statements
  • JdbcTemplates with Callable Statements

There were probably more. But I didn’t save my search history to find them all again.

I have an unique situation in which I’m working. The database on the backend is Microsoft SQL Server. The insert statement is using a Stored Procedure. And the stored procedure is returning the record id by calling @@IDENTITY itself.

Oftentimes, when one is working with MSSQL, one is working with the entire Microsoft stack, which includes some flavor of .Net. But the client wanted to have all their middleware code standardized to Spring Boot. Not a big deal, except for the fact that there were existing Stored Procedures to work with, not the database tables directly.

This has provided a few unique challenges. Rather than using Spring Boot’s ready-made method for automatically creating models for databases, the models had to be created by hand. And I learned early on that, when using StoredProcedureQuery, ‘column names’ wouldn’t work: the parameters had to be entered in the order expected by the stored procedure, and indices used instead.

That last item took a day to figure out.This latest issue took just about as long. And hopefully, some poor soul will find this post to aid in their quest for the same information.

That problem was inserting a record and returning the new id for that record. The first problem was dealing with null values. StoredProcedureQuery had been infinitely useful when running stored procedures that extracted data. It became almost cookie cutter to create the code:

StoredProcedureQuery query = [EntityManager].createStoredProcedureQuery("[stored procedure name]", [Model class].class);
query.registerStoredProcedureParameter([index], [typeclass].class, ParameterMode.IN [or] ParameterMode.OUT);
query.setParameter([index],[value]);
return query.getSingleResult(); [or] return query.getResultList();

But this wouldn’t work for inserts. Initially, my problem with inserts was that they could contain null values. StoredProcedureQuery didn’t like null values. And, after much research and trial and error, I found that JdbcTemplate was my friend. By using JdbcTemplate, I could add null parameters and the code wouldn’t choke on me. Problem was, JdbcTemplate.update, which I was using, would return the number of records affected, but not the new ID, if a record was being inserted. Thus began my long search for something else that would work.

The popular response in Stack Overflow was to use:


KeyHolder holder = new GeneratedKeyHolder();

I couldn’t make that work. I’d get errors because ‘nothing’ was being returned.

I checked all the other options on JdbcTemplate, since that still seemed my best option for accomplishing this (since I might have nulls in the insert parameters). I literally would check the documentation for each option, then Google to see how they were used.

I finally came across the documentation for CallableStatement. It is passed as a parameter in a call from JdbcTemplate, so my null values could be handled. And it allowed a return value to be ‘registered’ for retrieval. That is, since I was not retrieving a value with a column or variable name, I could set one in order to find the value after the call had been made.

My thanks to these two sources to helping me out:

http://www.java2s.com/Code/Java/Spring/ImplementsCallableStatementCreator.htm
http://forum.spring.io/forum/spring-projects/data/239-jdbctemplate-call-example

Here’s the basic syntax that I ended up using:


import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java util.Map;

import org.springframework.jdbc.core.CallableStatementCreator;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.SqlOutParameter;
import org.springframework.jdbc.core.SqlParameter;

// note: I did have to list all my input and output parameter types, unlike in
// the examples where the list only included the output type
List out = new ArrayList();
out.add(new SqlParameter(1, Types.NVARCHAR));
out.add(new SqlParameter(2, Types.INTEGER));
out.add(new SqlOutParameter(“id”,Types.NUMERIC));

Map<String, Object> returnList
= getJdbcTemplate().call(new MyCallableStatementCreator(params), out);

…..

class MyCallableStatementCreator implements CallableStatementCreator {

private ParamModel params;

public MyCallableStatementCreator(ParamModel param) {
this.params = params;
}
@Override
public CallableStatement createCallableStatement(Connection con) throws SQLException {
// note: I received an error until I put the brackets around the call, and the word ‘call’ was required
CallableStatement stmt = con.prepareCall(“{call my_stored_procedure(?,?,?) }”);

stmt.setString(1,params.getFirstVal());
stmt.setInt(2, params.getSecondVal());
stmt.registerOutParameter(3, Types.INTEGER);
return stmt;
}

Oracle WebCenter VM

I just started working with Oracle’s VirtualBox VM for WebCenter. Much easier for development – everything (all 6 apps!) are already installed. But I ran into a problem updating it. I received the following message:

[Errno 256] No more mirrors to try

Unfortunately, the one query I found about it on the Oracle mailing lists only said that the message referred to the fact that the user was trying to reach a link inside of Oracle when he was neither an employee or a contractor. And there wasn’t an answer listed to how to fix the issue.

The discussion was closed. Hopefully, someone will find this post if they ever run across the same situation.

The error is due to the fact that yum is set up to try to go through an Oracle proxy to get to the yum servers. To fix the error, open /etc/yum.conf and either remove or comment out the line with the proxy. After that, it works just fine.

Python and ElasticSearch

I’ve recently been working with a project that involved getting data out of an AWS ElasticSearch instance. I hearken back to my early days with Lucene, before Big Data was a buzzword.

I chose to use Python to accomplish this task, because it was to be a ‘quick’ item, so I didn’t want to use something heavy, like Java. And I wanted to become more familiar with Python.

Googling it, and checking StackOverflow, Python ElasticSearch and ElasticSearch-DSL seemed the modules best suited for accomplishing my goal. Problem was, although the documentation seemed extensive, the examples were limited, making it often difficult to accomplish what I needed to. Now that I’ve created several successful scripts using this library, I wanted to share them with everyone.

First Attempt

My first task included a number of aggregations that would be determined at runtime. I couldn’t find a way of making this work with the standard syntax provided. But there is a way to create a JSON object, then convert it for use in Python ElasticSearch. I found this code, which allowed me to create dictionaries for trees. It can then be converted to JSON using json.dumps. And the JSON object can then be fed into ElasticSearch.

My loop. Using this structure, I could modify the values at runtime:

 for i in range (1,len(mn_timestamps)):
 aggs['aggregations']['by_account']['aggs']['by_month']['filters']['filters'][mn_timestamps[i -1].label]['range']['execution'] = 'index'
 aggs['aggregations']['by_account']['aggs']['by_month']['filters']['filters'][mn_timestamps[i -1].label]['range']['TIMESTAMP']['gte'] = mn_timestamps[i - 1].timestamp
 aggs['aggregations']['by_account']['aggs']['by_month']['filters']['filters'][mn_timestamps[i -1].label]['range']['TIMESTAMP']['lt'] = mn_timestamps[i].timestamp

As stated in the documentation, terms that don’t exist are created when listed:

 search['query']['filtered']['query']['match_all']

Here’s the other instance I found this handy at this point. I couldn’t figure out how to get multiple filters added to the query. The dictionary to json method allowed me to do this easily. Notice that there’s another runtime variable. But since it’s only used once, I didn’t need the loop strucutre:

 timestamp = "{\'range\': { \'TIMESTAMP\': { \'gte\':" + str(mn_timestamps[0].timestamp) + ", \'lt\':" + str(mn_timestamps[ len(mn_timestamps) - 1].timestamp) + "}, \'execution\': \'index\' } }"
 hierarchy = "{ \'terms\': { \'hierarchy\': [\'" + account + "\'] } }"
 terms = "{ \'terms\': { \'VIEW_TYPE\': [ \'t1\', \'t2\',\'t3\' ] } }"
 search['query']['filtered']['filter']['bool']['must'] = [timestamp, hierarchy, terms]
 aggs['aggregations']['by_account']['terms']['field'] = 'ACCT_ID'
 aggs['aggregations']['by_account']['terms']['size'] = 0

Now convert the query from a dictionary/tree object into JSON:

 query = json.dumps(search) + json.dumps(aggs)

Here’s the original output:

{"query": {"filtered": {"filter": {"bool": {"must": ["{'range': { 'TIMESTAMP': { 'gte':1420070400000,
 'lt':1435708800000}, 'execution': 'index' } }", "{ 'terms': { 'hierarchy': ['account'] } }", "
{ 'terms': { 'VIEW_TYPE': [ 't1', 't2','t3' ] } }"]}}, "query": {"match_all": {}}}}}{"aggregations": 
{"by_account": {"terms": {"field": "ACCT_ID", "size": 0}, "aggs": {"by_month": {"filters": {"filters": {"201502": 
{"range": {"TIMESTAMP": {"lt": 1425168000000, "gte": 1422748800000}, "execution": "index"}}, "201503": {"range": 
{"TIMESTAMP": {"lt": 1427846400000, "gte": 1425168000000}, "execution": "index"}}, "201501": {"range": {"TIMESTAMP": 
{"lt": 1422748800000, "gte": 1420070400000}, "execution": "index"}}, "201504": {"range": {"TIMESTAMP": {"lt": 
1430438400000, "gte": 1427846400000}, "execution": "index"}}, "201505": {"range": {"TIMESTAMP": {"lt": 1435708800000, 
"gte": 1430438400000}, "execution": "index"}}}}}}}}}

Now the odd thing. I found that the object produced couldn’t be used directly. A few substitutions had to be done:

 query = query.replace("\"{","{")
 query = query.replace("}\"","}")
 query = query.replace("\'","\"")
 query = query.replace("}{",",")


Here’s the changed output. Note the bolded differences from above:

{"query": {"filtered": {"filter": {"bool": {"must": [{"range": { "TIMESTAMP": { "gte":1420070400000, 
"lt":1435708800000}, "execution": "index" } }, { "terms": { "hierarchy": ["account"] } }, { "terms": { "VIEW_TYPE": [ 
"t1", "t2","t3" ] } }]}}, "query": {"match_all": {}}}},"aggregations": {"by_account": {"terms": {"field": "ACCT_ID", 
"size": 0}, "aggs": {"by_month": {"filters": {"filters": {"201502": {"range": {"TIMESTAMP": {"lt": 1425168000000, 
"gte": 1422748800000}, "execution": "index"}}, "201503": {"range": {"TIMESTAMP": {"lt": 1427846400000, "gte": 
1425168000000}, "execution": "index"}}, "201501": {"range": {"TIMESTAMP": {"lt": 1422748800000, "gte": 1420070400000}, 
"execution": "index"}}, "201504": {"range": {"TIMESTAMP": {"lt": 1430438400000, "gte": 1427846400000}, "execution": 
"index"}}, "201505": {"range": {"TIMESTAMP": {"lt": 1435708800000, "gte": 1430438400000}, "execution": "index"}}}}}}}}}

Get the dictionary into our elasticsearch ‘terminology’ and run:

 qdict = ast.literal_eval(query)
 
 s = elasticsearch_dsl.Search(using=eclient, index='my_index').from_dict(qdict)
 response = s.execute()
Easy Queries

As I mentioned before, I had experience with Lucene queries from the past. So the next item to tackle was pretty easy for me. I used a simple Lucene query to accomplish my task:

 

query = "hierarchy: \"{0}\" AND TIMESTAMP: [{1} TO {2}]  AND (VIEW_TYPE: t1 OR VIEW_TYPE: t2 OR VIEW_TYPE:t3)".format(account, t_start, t_end)
    res = eclient.search(index=indx, q=query)

This query was the same as one would use in a Kibana interface to ElasticSearch.

Getting Around Pre-Determined Limits

The next hurdle I had to overcome was the limit of how many records could be returned: 10. Here, I had to go to the ElasticSearch documentation for an answer.  Again, I had to create a custom JSON object to handle the issue:

    query = '{"size":0,"query": {"filtered": {"filter": {"range": {"TIMESTAMP":'\
        + ' {"lt":' + str(end_date) +', "gte":' + str(start_date) + '}}}, "query": '\
        + '{"bool": {"must_not": [{"terms": {"VIEW_TYPE": ["t1", ' \
        + '"t2", "t3", "t4", "t5", "t5", "t6", "t6"]}}], "must": '\
        + '[{"match": {"hierarchy": "' + account + '"}}]}}}}, "aggs": {"by_account":'\
        + ' {"terms": {"field": "hierarchy", "size":300}, "aggs": {"by_product": {"terms": '\
        + '{"field": "PRODUCT", "size":30}}}}}}'
        
    print(query)
        
    response = eclient.search(body=query, index=indx)

Notice the size parameter added to the query.

Did I ever use the built-in ElasticSearch-DSL query builder? Well, I played with it. But I couldn’t necessarily find the specific quirks I needed handled. Here’s an example:

search = elasticsearch_dsl.Search().using(eclient).index(indx
    ).query("match", hierarchy=account
    ).query("bool", must_not=Q('terms', VIEW_TYPE=['t1','t2','t3','t4','t5','t6','t7','t8'])
    ).filter("range", TIMESTAMP={'gte': start_date,'lt': end_date})
          
search.aggs.bucket('by_account', 'terms',field='hierarchy')\
    .bucket('by_product','terms', field='PRODUCT')

This was the query I used, before finding that I had a 10 record limit on returns.