Some numbers that indicate PostgreSQL performance using JDBC to query and insert data. All of the tests were done on the same quiescent machine.

...

  • Inserted 1000000 rows in 2535 milliseconds, 394477.3175542406 rows per second
  • Inserted 1000000 rows in 2553 milliseconds, 391696.0438699569 rows per second

You do need a recent driver for this, apparently; I tested this with version 42.1.1 of the JDBC driver on PostgreSQL 9.6.3.

This is quite an increase! Thanks a lot, Babu!
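The actual insert code is elided above, but it presumably follows the standard JDBC batch pattern; the big speedup in recent driver versions typically comes from the reWriteBatchedInserts=true connection property, which makes the driver rewrite batches into multi-row inserts — likely the driver feature referred to above. A minimal sketch of that pattern (my reconstruction, reusing the table and test values from the COPY example below; the column types are assumed):

Code Block
import java.sql.*;

public class PgBatchInsertSketch {
   /**
    * Batched insert sketch; connection setup as in the COPY example below,
    * ideally with parameters.put("reWriteBatchedInserts", "true").
    */
   static void insertBatched(Connection conn, int size) throws SQLException {
      String insert = "insert into datavault.h_test(id_h,source_system_id,organisation_id,load_dts,boekingdetailid) values(?,?,?,?,?)";
      conn.setAutoCommit(false);
      try(PreparedStatement ps = conn.prepareStatement(insert)) {
         int batchSize = 10_000;
         for(int nr = 0; nr < size; nr++) {
            ps.setInt(1, nr);                            // id_h
            ps.setInt(2, 1);                             // source_system_id
            ps.setInt(3, 12);                            // organisation_id
            ps.setTimestamp(4, Timestamp.valueOf("2017-06-09 00:00:00"));
            ps.setString(5, "boe" + nr + "king" + nr);   // boekingdetailid
            ps.addBatch();
            if((nr + 1) % batchSize == 0)
               ps.executeBatch();            // Send a full batch to the server
         }
         ps.executeBatch();                  // Send the remaining partial batch
         conn.commit();
      }
   }
}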

...

Writing the same in Oracle is quite a lot faster. Without indexes or a primary key, but with full transaction logging, the performance is:

  • Inserted 1000000 rows in 1072 milliseconds, 1000000.0 rows per second
  • Inserted 24000000 rows in 47155 milliseconds, 510638.2978723404 rows per second

The fastest way so far: using the PostgreSQL COPY command

The fastest method I found so far to do inserts is to use PostgreSQL's COPY command. I rewrote the earlier Java insert code as follows:

Code Block
/**
 * @author <a href="mailto:jal@etc.to">Frits Jalvingh</a>
 * Created on 8-6-17.
 */
import org.postgresql.*;
import org.postgresql.copy.*;

import java.io.*;
import java.sql.*;
import java.util.*;

public class PgCopyTest {
   public static void main(String[] args) {
      try {
         Class.forName("org.postgresql.Driver");
         String dbURL = "jdbc:postgresql://localhost:5432/datavault_12_tst";
         Properties parameters = new Properties();
         parameters.put("user", "jal");
         parameters.put("password", "jal");

         Connection conn = DriverManager.getConnection(dbURL, parameters);
         conn.setAutoCommit(false);                // Important

         long lStartTime = System.currentTimeMillis();
         int size = 24_000_000;

         // Stream generated rows through the driver's COPY API; the Reader
         // below produces the COPY text-format payload on the fly.
         CopyManager cm = ((PGConnection) conn).getCopyAPI();
         cm.copyIn("copy datavault.h_test from stdin with (format text)", new InsertReader(size));
         conn.commit();

         long lEndTime = System.currentTimeMillis();
         long difference = lEndTime - lStartTime;
         double rps = (double) size / ((double)difference / 1000);
         System.out.println("Inserted " + size + " rows in " + difference + " milliseconds, " + rps + " rows per second");
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

/**
 * Reader that produces the COPY text-format data on the fly, in chunks, so
 * the full payload never has to be built in memory.
 */
class InsertReader extends Reader {
   private final int m_size;

   private final StringBuilder m_buffer = new StringBuilder();

   private int m_bufferIndex;

   private int m_count;

   public InsertReader(int count) {
      m_size = count;
   }

   @Override public int read(char[] cbuf, int off, int len) throws IOException {
      int remainder = m_buffer.length() - m_bufferIndex;
      if(remainder < len && m_count < m_size) {
         generateRecords();
         remainder = m_buffer.length() - m_bufferIndex;
      }

      // Report EOF only when all records have been generated AND the buffer
      // is fully drained; checking m_count alone would discard the tail end
      // of the last generated batch when the driver reads in small chunks.
      if(remainder <= 0)
         return -1;

      if(remainder > len)
         remainder = len;
      m_buffer.getChars(m_bufferIndex, m_bufferIndex + remainder, cbuf, off);
      m_bufferIndex += remainder;
      return remainder;
   }

   private void generateRecords() {
      // Drop the part of the buffer that was already handed out, then append
      // the next chunk of at most 1000 records.
      m_buffer.replace(0, m_bufferIndex, "");
      m_bufferIndex = 0;
      int left = m_size - m_count;
      if(left > 1000)
         left = 1000;

      while(left-- > 0) {
         addRecord(m_count++);
      }
   }

   private void addRecord(int nr) {
      // One row in COPY text format: tab-separated columns, newline-terminated.
      m_buffer
         .append(Integer.toString(nr))
         .append("\t")
         .append("1")
         .append("\t")
         .append("12")
         .append("\t")
         .append("2017-06-09 00:00:00")
         .append("\t")
         .append("boe" + nr + "king" + nr)
         .append("\n");
   }

   @Override public void close() throws IOException {
   }
}

...

  • The format to write each record's data in is error-prone, as we need exact conversions for all possible types. Dates, for instance, are hard: you need to use the same format as defined on the database server, or set the server to some known format (which means other parallel tasks might suffer). See the sketch after this list.
  • It is hard to handle transactions.
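One way to make the text format less fragile (a sketch of mine, not part of the original code): escape the characters that are special in COPY's text format, and write timestamps in ISO year-month-day order, which PostgreSQL interprets the same way under any DateStyle setting:

Code Block
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class CopyTextFormat {
   // ISO ordering ("2017-06-09 00:00:00") is read as year-month-day under any
   // DateStyle, so it is safe input for a timestamp column in COPY text format.
   private static final DateTimeFormatter TS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

   public static String timestamp(LocalDateTime dts) {
      return TS.format(dts);
   }

   // In COPY text format the backslash is the escape character, and tab /
   // newline / carriage return delimit columns and rows, so these must be
   // escaped inside a field; a null value is written as \N.
   public static String field(String value) {
      if(value == null)
         return "\\N";
      StringBuilder sb = new StringBuilder(value.length() + 8);
      for(int i = 0; i < value.length(); i++) {
         char c = value.charAt(i);
         switch(c) {
            case '\\': sb.append("\\\\"); break;
            case '\t': sb.append("\\t"); break;
            case '\n': sb.append("\\n"); break;
            case '\r': sb.append("\\r"); break;
            default: sb.append(c); break;
         }
      }
      return sb.toString();
   }
}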


Stuff that has little or no effect

Parameter changes

For these tests I set the following PostgreSQL config parameters:

Parameter               Description
shared_buffers = 512MB  Changing this value to 8GB has no measurable effect. This seems
                        logical, as inserts should not use that much buffer space; since a
                        commit is included, the to-disk time is the limiting factor.


Other parameters that might be of importance were set as follows:


work_mem 512MB
synchronous_commit off
commit_delay 100000
autovacuum_naptime 10min

The latter was done because I noticed autovacuum running several times during those inserts. That seemed like a bad plan.

Using unlogged tables

Setting the table to unlogged had no apparent effect.
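For reference, a sketch of how the table could be switched (my assumption, the original statement is not shown; ALTER TABLE ... SET UNLOGGED requires PostgreSQL 9.5 or newer):

Code Block
// Switch the target table to unlogged before the run and back afterwards.
// Needs PostgreSQL 9.5+; on older versions the table has to be created with
// CREATE UNLOGGED TABLE instead.
try(Statement st = conn.createStatement()) {
   st.execute("alter table datavault.h_test set unlogged");
}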


...