NIO Performance

Hi there,
I have a SocketChannel connection from a client to a server and both are running on the same machine (a Windows machine). I send some data (around 100 bytes) from the client to the server, which takes less than 1ms to complete the SocketChannel.write(). The server reads the data and immediately writes back a reply, which is just 4 bytes.
The problem is that it seems to take a long time for the data to get from the client to the server. Using the nanosecond timer I measured the time that the write completes on the client below, and when the server receives it.
                                timestamp (ns)     difference (ns)
client write complete           1649499857150061
server first 4 bytes read       1649499870403961   13,253,900
server read and write complete  1649499873373763    2,969,802

There is a good 13ms delay from writing on the client to receiving on the server.
The server constantly performs non-blocking SocketChannel.read() calls with no delay in between, so it should pick data off the socket as soon as it's available.
13ms is a long time. I'm sending lots of small packets of data with handshaking, and a 13ms delay in both directions for each message causes a massive slowdown.
Has anybody got any idea what's going on here, and why it takes so long? My guess is that there's some latency in either the Java socket implementation or in the Windows socket libraries. Or it might be my fault...
Cheers,
Chris.

TCP/IP contains more latency than this by definition. The Nagle algorithm alone can introduce up to 200ms of latency for small writes; you could try turning it off at the sending end.
But your strategy of spinning in a non-blocking read loop is also probably causing more problems than it solves: it just burns cycles that may be needed elsewhere. Use Selector.select() or blocking mode, so that you're woken up when data arrives without frying the CPU and stealing valuable cycles.
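
A minimal sketch of both suggestions, assuming a placeholder port and leaving out error handling: the server blocks in Selector.select() instead of spinning, and Nagle is disabled via Socket.setTcpNoDelay(true); on the client, the same setTcpNoDelay call would go on its own connected SocketChannel's socket.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class LowLatencyEcho {
    public static void main(String[] args) throws IOException {
        // Server side: block in select() instead of spinning on read().
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(9000)); // hypothetical port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(256);
        while (selector.select() > 0) {          // wakes up only when something is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    // Disable Nagle so small writes go out immediately
                    // instead of being coalesced (same call on the sending side).
                    client.socket().setTcpNoDelay(true);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel channel = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = channel.read(buffer);
                    if (n < 0) {
                        channel.close();
                    } else {
                        buffer.flip();
                        channel.write(buffer); // echo the small reply straight back
                    }
                }
            }
        }
    }
}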

Similar Messages

  • NIO Performance Test Result

    Dear forum users.
    I wonder why "New I/O" (java.nio.*) is useful.
    I tested "New I/O" performance.
    Please see the code below.
    import java.io.BufferedInputStream;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    public class ByteBufferPerformanceTest {
        public static void main(String[] args) {
            File fileName = new File("c:\\kandroid_book_3rd_edition[1].pdf");   // 20MB file

            // ByteBuffer usage
            long start1 = System.nanoTime();
            try {
                FileInputStream fis = new FileInputStream(fileName);
                FileChannel fc = fis.getChannel();
                ByteBuffer bf = ByteBuffer.allocateDirect(1024);
                while (fc.read(bf) != -1) {
                    //System.out.print(new String(bf.array(), 0, 1024));
                    bf.clear();
                }
                fis.close();
            } catch (FileNotFoundException ffe) {
                ffe.printStackTrace();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
            long duration1 = System.nanoTime() - start1;

            // BufferedInputStream usage
            BufferedInputStream bin = null;
            long start2 = System.nanoTime();
            try {
                bin = new BufferedInputStream(new FileInputStream(fileName));
                byte[] contents = new byte[1024];
                int bytesRead = 0;
                while ((bytesRead = bin.read(contents)) != -1) {
                    //System.out.print(new String(contents, 0, bytesRead));
                }
            } catch (FileNotFoundException ffe) {
                ffe.printStackTrace();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            } finally {
                try {
                    if (bin != null)
                        bin.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            long duration2 = System.nanoTime() - start2;

            // FileReader usage
            long start3 = System.nanoTime();
            try {
                FileReader fr = new FileReader(fileName);
                BufferedReader br = new BufferedReader(fr);
                String line;
                while ((line = br.readLine()) != null) {
                    // discard the line
                }
                br.close();
            } catch (FileNotFoundException ffe) {
                ffe.printStackTrace();
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
            long duration3 = System.nanoTime() - start3;

            System.out.println(String.format("%20s : %12d", "ByteBuffer", duration1));
            System.out.println(String.format("%20s : %12d", "BufferedInputStream", duration2));
            System.out.println(String.format("%20s : %12d", "FileReader", duration3));
        }
    }

    Result (nanoTime):
    ByteBuffer : 60107360
    BufferedInputStream : 22748701
    FileReader : 597288203
    Result: as you can see, the best class for file I/O is BufferedInputStream.
    So why is ByteBuffer needed?
    Did I test it the wrong way?
    Thanks for reading. Thank you very much. :)

    First of all: your test is very, very flawed, in multiple ways:
    1.) You read the same file three times. The first read takes the cache hit while the OS actually loads the file from disk; the others just test how fast accessing the OS cache is.
    2.) You only do a single read of the file and didn't tell us whether you ran the experiment multiple times (to avoid small timing differences influencing the result).
    3.) Your three methods do different things. Specifically, the last method converts the bytes to Strings, which is meaningless for a binary file and takes additional time.
    All that being said: NIO isn't simply "faster". It provides ways to implement non-blocking IO for tasks such as servers supporting a massive number of connections and similar high-performance scenarios. If you simply want to read a file once, then "normal" IO will be perfectly fine for you.
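
    A minimal sketch of a fairer comparison along the lines described above (warm-up and repeated runs, identical work per variant); the file path is passed in rather than hard-coded, and the buffer size is an arbitrary choice:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    public class FairReadBenchmark {
        public static void main(String[] args) throws IOException {
            File file = new File(args[0]); // pass the test file on the command line

            // Repeat several runs: the first ones warm the OS cache and HotSpot,
            // the later ones are the interesting numbers.
            for (int run = 0; run < 5; run++) {
                long t1 = timeChannelRead(file);
                long t2 = timeStreamRead(file);
                System.out.printf("run %d: FileChannel %d ns, FileInputStream %d ns%n", run, t1, t2);
            }
        }

        // Both variants do exactly the same work: read the whole file in 8 KB chunks.
        static long timeChannelRead(File file) throws IOException {
            long start = System.nanoTime();
            try (FileInputStream fis = new FileInputStream(file);
                 FileChannel fc = fis.getChannel()) {
                ByteBuffer buf = ByteBuffer.allocateDirect(8192);
                while (fc.read(buf) != -1) {
                    buf.clear();
                }
            }
            return System.nanoTime() - start;
        }

        static long timeStreamRead(File file) throws IOException {
            long start = System.nanoTime();
            try (FileInputStream in = new FileInputStream(file)) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // discard the bytes; identical work to the channel variant
                }
            }
            return System.nanoTime() - start;
        }
    }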

  • Java NIO, ByteBuffers and Linksys router

    I have a client server app/game that uses NIO for communication, sending ByteBuffers. On a LAN with 5-8 users it runs great. On the internet, through a Linksys router, with one user, it has a blip. I get all my data transmissions except for one buffer. Whenever I chat, the buffer contains a size, an int typeID and the encoded string for chat. This particular buffer never makes it to the client on the outside of the router. I have a port forwarded, and regular TCP/IP java.io sockets stuff works fine, as does all of the other NIO buffer traffic for locational data, logging in and out, etc. Any thoughts?

    But I am not sure what the performance of those clients would be when compared to Java NIO performance... Telnet isn't a high-performance protocol anyway. Don't worry about it. Use existing code. Get it working. Measure. If you have a performance issue, then worry; at least you'll have something you can deploy. It won't be a problem. The router is there to route, not to talk high-speed telnet.

  • Help! My application uses a Single Thread !

    Hi all !
    I have a web application which performs some long running tasks. This can be easily simulated with:
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        System.out.println("Started Long Running Task!");
        try {
            Thread.sleep(20000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("Done");
    }
    In order to deal with Long Running Tasks, I have created a WorkManager with MinThreads 10 and MaxThreads 100
    Then I have assigned the Work Manager to the Web application using weblogic.xml:
    <?xml version="1.0" encoding="UTF-8"?>
    <weblogic-web-app xmlns="http://www.bea.com/ns/weblogic/90">
    <wl-dispatch-policy>WorkManager-0</wl-dispatch-policy>
    </weblogic-web-app>
    However it seems that the Web application uses a SINGLE thread to reply to the servlet. In other words, issuing n parallel requests, the output is:
    Started Long Running Task!
    [20 Seconds Pause]
    Started Long Running Task!
    [20 Seconds Pause]
    Started Long Running Task!
    [20 Seconds Pause]
    Started Long Running Task!
    [20 Seconds Pause]
    My settings are the default WebLogic 12c Server settings; I've just added the WebLogic NIO performance libs to the Java path.
    Is there any setting which allows just one socket for my application? Maybe it's because I'm using the "unlicensed" (free download) server version?
    Thanks a lot
    Frank

    You need to create separate Windows user accounts if you want to separate the behaviour of iTunes for each user. That also means separate iTunes libraries for each user.
    Windows is a multi-user operating system but you are not using it properly. iTunes is not a multi-user application. No application is. You can't expect it to treat different users differently when they are all using the same computer user account.
    Do you understand what I mean?

  • Selectable SocketChannels  or httpurlconnection s?

    I'm developing a multithreaded server for sending SMS over HTTP.
    This will involve various message payloads of various types (files / texts etc.)
    sent to multiple different HTTP destinations via POST / GET requests.
    The system will have to degrade gracefully and respond intelligently to events such as destinations not being reachable, message loads increasing, latency increasing on requests, server 500 HTTP responses, 401 not authorised, etc.
    Ideally I'd like to have the threads use some form of "sleep until I get a response / connection from the remote http server", as this would be optimal.
    Currently I'm considering using either
    new HttpUrlConnections per message sent
    or
    Selectable SocketChannels
    as they offer what looks like the type of blocking action I'm looking for.
    Can anyone enlighten me as to which method of connecting would give optimal performance and how best to go about it?
    Also, is HttpUrlConnection thread safe if I create a new instance per message per thread?
    I've read Doug Lea's book and have a fair idea of how I'm going to do the threading; it's just the connection part that I'm wary of.
    Thanks,
    AnRonMor

    I'd recommend HttpURLConnection. It provides a number of useful methods for dealing with HTTP, and if you plan to use one thread per connection, it's fine. I'd use queues between the connection code and the mainline code to avoid any threading issues. Each connection will block independently.
    If you use the NIO SelectableChannel you should use a single thread to perform all the I/O. At present the jury is still out on whether the cost of demultiplexing selections is less than the cost of multiple threads. Here's an interesting discussion of NIO Performance; also my thread Taming the NIO Circus provides examples of using NIO, including the queuing of interestOps as suggested in the other thread.
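
    A minimal sketch of the queue-plus-worker arrangement suggested above, assuming a hypothetical Message class and endpoint URL; one HttpURLConnection is created per message inside the worker thread, so each connection blocks only that worker:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class SmsSenderWorker implements Runnable {
        // Hypothetical message type: just a destination URL and a payload.
        static class Message {
            final String url;
            final String body;
            Message(String url, String body) { this.url = url; this.body = body; }
        }

        private final BlockingQueue<Message> queue;

        SmsSenderWorker(BlockingQueue<Message> queue) {
            this.queue = queue;
        }

        public void run() {
            try {
                while (true) {
                    Message msg = queue.take();          // mainline code hands work over via the queue
                    HttpURLConnection conn = (HttpURLConnection) new URL(msg.url).openConnection();
                    conn.setRequestMethod("POST");
                    conn.setDoOutput(true);
                    conn.setConnectTimeout(5000);        // fail fast if the destination is unreachable
                    conn.setReadTimeout(10000);
                    try (OutputStream out = conn.getOutputStream()) {
                        out.write(msg.body.getBytes(StandardCharsets.UTF_8));
                    }
                    int status = conn.getResponseCode(); // blocks this worker only
                    if (status >= 500) {
                        // e.g. re-queue or log; the other workers keep running
                        System.err.println("server error " + status + " for " + msg.url);
                    }
                    conn.disconnect();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) throws Exception {
            BlockingQueue<Message> queue = new LinkedBlockingQueue<>();
            for (int i = 0; i < 4; i++) {                // a few independent workers
                new Thread(new SmsSenderWorker(queue)).start();
            }
            queue.put(new Message("http://sms.example.com/send", "text=hello")); // hypothetical endpoint
        }
    }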

  • Are the experts wrong about non-blocking SocketChannels?

    Everyone says to use the new SocketChannel and Selector for "highly scalable" server applications. So I ran a couple of tests using non-blocking (via Selector) and thread-per-connection SocketChannels.
    The Selector version consumes 8x more CPU than the thread-per-connection version!
    Using JDK 1.4.1 FCS on Win2K, with 1000 socket connections each sending 1K bytes per second, the Selector version was consuming 40% CPU with 10 threads; the thread-per-connection version (using blocking SocketChannels) was only consuming about 5% CPU with 1009 threads.
    So, are the experts wrong? Is there a performance problem when the number of SocketChannels exceeds a certain threshold?
    For anyone interested, here's the source code:
    Non-Blocking Server using Selector
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetDecoder;
    import java.nio.charset.CharsetEncoder;
    import java.nio.channels.FileChannel;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Collections;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Set;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    public class Server4 implements Runnable {
        private static int port = 80;

        public static void main(String args[]) throws Exception {
            Server4 server = new Server4();
            Thread thread = new Thread(server);
            thread.setDaemon(true);
            thread.start();
            thread.join();
        }

        public Server4() throws IOException {
            ServerSocketChannel server = ServerSocketChannel.open();
            InetSocketAddress isa = new InetSocketAddress(port);
            server.configureBlocking(false);
            server.socket().bind(isa);
            m_selector = Selector.open();
            server.register(m_selector, SelectionKey.OP_ACCEPT);
            Charset utf8 = Charset.forName("UTF-8");
            m_decoder = utf8.newDecoder();
            m_encoder = utf8.newEncoder();
        }

        public void run() {
            int count = 0;
            try {
                ByteBuffer buffer = ByteBuffer.allocateDirect(2048);
                //FileOutputStream fos = new FileOutputStream("server4.dat");
                //FileChannel fc = fos.getChannel();
                while (m_selector.select() > 0) {
                    Set keys = m_selector.selectedKeys();
                    for (Iterator itr = keys.iterator(); itr.hasNext(); ) {
                        SelectionKey key = (SelectionKey) itr.next();
                        itr.remove();
                        if (key.isAcceptable()) {
                            System.out.println("accept: " + (++count));
                            ServerSocketChannel server
                                = (ServerSocketChannel) key.channel();
                            SocketChannel channel = server.accept();
                            channel.configureBlocking(false);
                            channel.register(m_selector, SelectionKey.OP_READ);
                        } else {
                            SocketChannel channel = null;
                            try {
                                if (key.isReadable()) {
                                    channel = (SocketChannel) key.channel();
                                    int bytes = channel.read(buffer);
                                    if (bytes <= 0) { // Linux does not throw IOException
                                        channel.close(); // will also cancel key
                                        System.out.println("connection closed " + count);
                                    } else {
                                        buffer.flip();
                                        //fc.write(buffer);
                                        buffer.clear();
                                    }
                                }
                            } catch (IOException ioe) { // connection closed by client
                                System.out.println("readable: " + ioe.getMessage());
                                sm_logger.log(Level.INFO, ioe.getMessage(), ioe);
                                Throwable cause = ioe.getCause();
                                if (cause != null) {
                                    System.out.println("cause: "
                                        + cause.getClass().getName()
                                        + ": " + cause.getMessage());
                                }
                                channel.close(); // will also cancel key
                                --count;
                            }
                        }
                    }
                }
            } catch (IOException e) {
                System.out.println("run: " + e.getMessage());
                sm_logger.log(Level.SEVERE, e.getMessage(), e);
            } catch (Exception e) {
                System.out.println("run: " + e.getMessage());
                sm_logger.log(Level.SEVERE, e.getMessage(), e);
            }
        }

        private Selector m_selector;
        private CharsetDecoder m_decoder;
        private CharsetEncoder m_encoder;
        private static Logger sm_logger = Logger.getLogger("Server");
    }
    Thread-Per-Connection Server
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetDecoder;
    import java.nio.charset.CharsetEncoder;
    import java.nio.channels.FileChannel;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Collections;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Set;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    public class MultiThreadServer implements Runnable {
        private static int port = 80;

        public static void main(String[] args) throws Exception {
            ServerSocketChannel server = ServerSocketChannel.open();
            InetSocketAddress isa = new InetSocketAddress(port);
            server.socket().bind(isa);
            int count = 0;
            while (true) {
                SocketChannel channel = server.accept();
                System.out.println("accept: " + (++count));
                MultiThreadServer worker = new MultiThreadServer(channel);
                Thread thread = new Thread(worker);
                thread.setDaemon(true);
                thread.start();
            }
        }

        public MultiThreadServer(SocketChannel channel) throws IOException {
            m_channel = channel;
        }

        public void run() {
            ByteBuffer buffer = ByteBuffer.allocateDirect(2048);
            int bytes = 0;
            try {
                while ((bytes = m_channel.read(buffer)) > 0) {
                    buffer.flip();
                    // process buffer
                    buffer.clear();
                }
                System.out.println("connection closed");
                m_channel.close();
            } catch (IOException e) {
                System.out.println("run: " + e.getMessage());
                sm_logger.log(Level.SEVERE, e.getMessage(), e);
            } catch (Exception e) {
                System.out.println("run: " + e.getMessage());
                sm_logger.log(Level.SEVERE, e.getMessage(), e);
            }
        }

        private SocketChannel m_channel;
        private static Logger sm_logger = Logger.getLogger("MultiThreadServer");
    }
    Client
    import java.io.*;
    import java.net.*;
    import java.nio.*;
    import java.nio.channels.*;
    import java.nio.charset.*;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.Set;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    public class MultiClient implements Runnable {
        public static void main(String[] args) throws Exception {
            if (args.length < 1) {
                System.out.println("usage: java MultiClient number [host]");
                System.exit(1);
            }
            int number = Integer.parseInt(args[0]);
            String host = (args.length == 2) ? args[1] : "localhost";
            Thread[] threads = new Thread[number];
            InetSocketAddress address = new InetSocketAddress(host, 80);
            for (int i = 0; i < number; i++) {
                MultiClient client = new MultiClient(address, Integer.toString(i));
                threads[i] = new Thread(client);
                threads[i].setDaemon(true);
            }
            for (int i = 0; i < number; i++) {
                threads[i].start();
            }
            for (int i = 0; i < number; i++) {
                threads[i].join();
            }
        }

        public MultiClient(InetSocketAddress address, String id)
                throws InterruptedException, IOException {
            m_id = id;
            Charset charset = Charset.forName("UTF-8");
            m_decoder = charset.newDecoder();
            m_encoder = charset.newEncoder();
            m_channel = SocketChannel.open();
            m_channel.connect(address);
            if (id.equals("0")) {
                Socket socket = m_channel.socket();
                System.out.println("SO_SNDBUF=" + socket.getSendBufferSize()
                    + ",SO_TIMEOUT=" + socket.getSoTimeout()
                    + ",SO_KEEPALIVE=" + socket.getKeepAlive());
            }
            byte[] buf = new byte[1024]; // bufsize = 1K
            Arrays.fill(buf, (byte) m_id.charAt(0));
            m_buffer = ByteBuffer.allocateDirect(1024);
            m_buffer.put(buf);
            m_buffer.flip();
            Thread.sleep(50L);
        }

        public void run() {
            System.out.print(m_id);
            try {
                while (true) {
                    m_channel.write(m_buffer);
                    m_buffer.rewind();
                    Thread.sleep(1000L);
                }
            } catch (IOException ioe) {
                ioe.printStackTrace();
            } catch (InterruptedException ie) {
                System.err.println(ie.toString());
            }
        }

        private String m_id;
        private CharsetEncoder m_encoder;
        private CharsetDecoder m_decoder;
        private SocketChannel m_channel;
        private ByteBuffer m_buffer;
    }

    This is a crosspost. I posted this earlier today at http://forum.java.sun.com/thread.jsp?forum=4&thread=319822 before I stumbled on a search phrase that located this older thread.
    All follow-ups should be on haam's thread instead of mine. The important point below is that NIO select() behavior (vs. threaded IO) is worse under Windows but better under Solaris. This seems fundamentally broken.
    My company sells a scalable multi-user server platform built on Java 2.
    It runs under Java 1.3.1 (and 1.4.0 on Windows) using multiple threads for communications, and 1.4.x (Unix) using NIO. We were happy to see that 1.4.1 (Windows) fixed the problem that drastically limited the number of selectable ports. :-)
    The bad news is that whatever the VM is doing "under the sheets" to fix the problem seems to perform very poorly in terms of CPU:
    I compared connecting 500 simulated users to a Solaris 8 and a Win2K box. These users were in 25 chat rooms, each sending a chat message every 30 seconds. (There was plenty of memory on each machine. There was no swapping in either case. Clock/CPU type doesn't matter as this isn't about comparing a machine to a machine, but different load characteristics -within- a machine environment.)
                      Threaded IO      NIO/Select
    Solaris 1.4.1     20-30%           15-20%
    Windows 1.4.1     40-50%           95-100%

    Numbers are % of CPU as reported by 'top' and the Win2K task manager.
    Both platforms showed the expected significant improvement in memory usage when moving from standard threaded IO to NIO.
    Strangely, the Windows implementation of the VM showed a significant (and unexpected) degradation of NIO performance vs. the threaded model, whereas the Solaris VM behaved as expected: NIO outperformed threaded IO.
    Our best guess is that the Selector fix in 1.4.1 is implemented in some CPU-intensive way, perhaps polling. As a result, we still can't use NIO for Wintel boxes running our server. :-( To us, Selector is still broken.
    Has anyone else seen results like this? Have we missed some configuration parameter that will fix this?
    I thought the big upside of using select() was supposed to be performance. :-P
    F. Randall Farmer
    State Software, Inc.
    http://www.statesoftware.com

  • What is the status of fixing nio Buffer performance?

    The performance of nio Buffers is very slow in comparison to primitive arrays. Does anyone know the status of getting this fixed? For performance-sensitive code, Buffers are currently unacceptable.
    Several bugs were opened regarding this issue, but they have all mysteriously been closed without resolution.
    (Bug ID: 4411600, http://developer.java.sun.com/developer/bugParade/bugs/4411600.html)
    Any insight would be appreciated.

    Here are numbers using the code from the Bug reference I posted above.
    (Bug ID: 4411600, http://developer.java.sun.com/developer/bugParade/bugs/4411600.html)
    The only change I made was to the main() function so I could run tests multiple times to see if Hotspot would kick in at some point.
    Here is the change I made. Replaced the original main() function with the following:
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    public static void main(String[] args) {
        for (int x = 0; x < 5; x++) {
            System.gc();
            System.out.println();
            System.out.println(">>>> run #" + x);
            run();
        }
    }

    public static void run() {
        timeArray();
        time("heap", ByteBuffer.allocate(SIZE));
        time("direct", ByteBuffer.allocateDirect(SIZE));
    }
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    Btw, I am running this under Win2k.
    The results follow.
    D:\dev\parse>java -server Bench3
    run #0
    array 30
    array rev 80
    heap 51
    heap rev 90
    direct 601
    direct rev 401
    run #1
    array 40
    array rev 40
    heap 571
    heap rev 661
    direct 671
    direct rev 290
    run #2
    array 41
    array rev 40
    heap 550
    heap rev 671
    direct 591
    direct rev 291
    run #3
    array 40
    array rev 50
    heap 551
    heap rev 661
    direct 590
    direct rev 300
    run #4
    array 50
    array rev 51
    heap 561
    heap rev 651
    direct 581
    direct rev 291
    D:\dev\parse>java -client Bench3
    run #0
    array 181
    array rev 190
    heap 380
    heap rev 390
    direct 791
    direct rev 701
    run #1
    array 180
    array rev 180
    heap 861
    heap rev 802
    direct 731
    direct rev 711
    run #2
    array 180
    array rev 180
    heap 1202
    heap rev 721
    direct 721
    direct rev 971
    run #3
    array 200
    array rev 180
    heap 741
    heap rev 731
    direct 861
    direct rev 821
    run #4
    array 180
    array rev 210
    heap 911
    heap rev 721
    direct 741
    direct rev 741
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    But after taking a closer look at the code I realized that both the heap and direct buffers were being run through the same sum() and reverse() methods, so maybe Hotspot's inlining policy avoids keeping more than one specialized version of a function cached?
    So I re-ran the benchmark, but only running time("heap", ...) and time("direct", ...) one at a time. I will not bother posting the numbers, but this time array, heap, and direct were all roughly equivalent.
    So depending on the scenario, maybe it is not as bad as I thought.
    To try another variant, I replaced all calls to bb.get(j) with bb.get(), thinking this might be an opportunity for Buffer to be faster than array, since no start-of-buffer bounds check should be necessary (it can only iterate forward).
    But even running heap and direct separately, the test yielded slow numbers that suggest no inlining was performed again.
    So I'm not sure what to think. I would like to use Buffers over primitive arrays, but I have no certainty about the performance characteristics of Buffers: they may run fast, or they may run very slowly, depending on the scenario and which functions are used.
    Here are the numbers after replacing calls to bb.get(j) with bb.get().
    D:\dev\parse>java -server Bench3
    run #0
    array 41
    array rev 80
    heap 271
    heap rev 381
    run #1
    array 40
    array rev 50
    heap 231
    heap rev 321
    run #2
    array 40
    array rev 40
    heap 241
    heap rev 321
    run #3
    array 40
    array rev 50
    heap 231
    heap rev 321
    run #4
    array 40
    array rev 40
    heap 231
    heap rev 320
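
    One more variant that might be worth timing, not shown in the numbers above: a bulk ByteBuffer.get(byte[]) amortizes the per-element bounds check over a whole chunk. A minimal sketch of that benchmark leg (the SIZE constant and the summing loop are assumptions mirroring the style of the bug-report code, which isn't reproduced here):

    import java.nio.ByteBuffer;

    public class BulkGetBench {
        static final int SIZE = 1 << 20; // 1 MB, an assumed size

        public static void main(String[] args) {
            ByteBuffer bb = ByteBuffer.allocateDirect(SIZE);
            byte[] chunk = new byte[4096];
            for (int run = 0; run < 5; run++) {
                bb.clear();
                long start = System.nanoTime();
                long sum = 0;
                while (bb.remaining() >= chunk.length) {
                    bb.get(chunk);                  // one bounds check per 4 KB, not per byte
                    for (int i = 0; i < chunk.length; i++) {
                        sum += chunk[i];
                    }
                }
                System.out.println("bulk direct " + (System.nanoTime() - start) / 1000000 + " ms (sum=" + sum + ")");
            }
        }
    }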

  • Measuring the Performance of NIO Selectors ?

    Hi,
    I am trying to build a Java messaging API using the java.nio package. This is similar to MPI (which is in C/Fortran), so obviously I need to make a comparison of it in terms of performance (how fast my library can transfer your messages).
    Anyway, during a simple point-to-point communication, meaning that one process is sending a message and the other one is just receiving it, I get latency of around 40 milliseconds, which is really unacceptable. What I have tried to understand through painful debugging and analysis of my program is that when I try to write something to the other node, I copy my message into the buffer and then I wake up the selector so that it may write whatever I have copied into the buffer. My send method takes nearly 40 milliseconds, and for 38 of those milliseconds I wait for the selector to wake up. Even when it wakes up, it's not ready to fire a write event, maybe because the channel is not ready or something else. So my question is: how can I control this behaviour of the selector? How can I make the channel become writable faster than this? I can't afford for the selector to be not ready to write for 38 milliseconds. It's very, very slow. Can anyone throw some light on this please?
    Thanks
    --Aamir

    Thanks for your replies. So let me get into a little bit detail now.
    You have suggested that one should register for OP_WRITE only when you get a short write. Right, I do get short writes, and I understand what you mean by this. So now, please leave aside short writes; let's talk about the write or no-write situation.
    There are two parts, as I assume every NIO program has: one is the interface to users, like send() or recv(), and the second part is the selector itself. With regard to OP_WRITE, if I add OP_WRITE to the interestOps() during start-up, then I pay for it with 100 percent CPU usage. I posted a problem like this on the forums and you guys suggested adding OP_WRITE to interestOps() only when you need to write something, in my case when someone has called send(), because only then does it make sense to have an OP_WRITE event in the selector. Otherwise it just loops and loops and takes 100 percent CPU. So, on the other hand, if I don't add OP_WRITE to interestOps() initially, which is what I am doing right now, then I add it when someone calls send(), because as I said, only then am I interested in having an OP_WRITE event in the selector. So in the send method, I do this:
        SelectionKey key = tempChannel.keyFor(controlSelector);
        key.interestOps(SelectionKey.OP_WRITE);
        key.selector().wakeup();
    to wake up the selector so that I may write. But the time taken in the transition from this code to the OP_WRITE event code is almost 39 milliseconds (out of the total 40 milliseconds) of my send(), so clearly I am missing something, and once I am clear about that, I think it can go down to 1 millisecond (which is as fast as Java can go).
    Actually, let me explain the problem in a little more detail, because it's very interesting to me; I don't know what's going wrong with it, but anyway.
    My send and recv methods are like this:
    Send method                                               | Recv method
    Step 1: user calls send()                                 | Step 1: user calls recv()
    Step 2: control selector writes a control message         | Step 2: control selector reads the control message
            telling the length and ID of the message          |
    Step 3: expects a reply to its control message            | Step 3: gives an OK to the sender to send the actual data
            from the receiver                                 |
    Step 4: sends the actual data                             | Step 4: receives the actual data
    This should give an idea of the hand-shaking I am doing before sending the actual data; this handshaking is done by a separate selector, and the actual transmission is done by another selector. So that was my application; now here's what I am trying to do.
    Ping Pong Test
    NODE 1                                      NODE 2
    First part:
    Send()  --------------------------------->  Recv()   // whatever it received, send it back
    Second part:
    Recv()  <---------------------------------  Send()
    So you may well imagine that there's a lot of waking up the selector and everything going on here; now let's see the timings.
    NODE 1                                      NODE 2
    Send()                                      Recv()
    First part:
    Sender step 1 (0 ms)  --------------------- Recv step 1 (0 ms)
    Sender step 2 (0 ms)  --------------------- Recv step 2 (0 ms)
    Sender step 3 (0 ms)  --------------------- Recv step 3 (0 ms)
    Sender step 4 (0 ms)  --------------------- Recv step 4 (0 ms)
    Recv()                                      Send()
    Second part:
    Recv() step 1 (0 ms)  --------------------- Sender step 1
    (Problematic bit) -- Here, on the send(), the transition time between step 1 and step 2 is the whole time, i.e. the whole 40 milliseconds: in step 1 I wake up the selector to write, and in step 2 I just write. That is the whole problem.
    Recv() step 2 (40 ms) --------------------- Sender step 2 (40 ms)
    Recv() step 3 (0 ms)  --------------------- Sender step 3 (0 ms)
    Recv() step 4 (0 ms)  --------------------- Sender step 4 (0 ms)
    Interestingly, if I have something like a Barrier (synchronization point) in my ping-pong test, like
    NODE 1                                      NODE 2
    Send()    ------------------------------->  Recv()
    Barrier() -------------------------------   Barrier()
    Recv()    <-------------------------------  Send()
    Barrier() -------------------------------   Barrier()
    In a scenario like this, where there is some time (you might say sleeping time) between the send and recv methods, everything is fine.
    I don't expect you all to understand, but if by any chance someone gets a clue about what could be going wrong, please do comment.
    Sorry for the long post; I had no other option.
    Thanks in advance
    --Aamir
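
    A minimal sketch of the approach suggested earlier in the thread: write immediately from send() on the caller's thread, and only register OP_WRITE (and wake the selector) when a write comes up short. The channel and selector fields are assumed to be already connected and registered; this is a sketch, not the poster's actual code.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.SocketChannel;

    public class EagerWriter {
        private final SocketChannel channel;   // assumed already connected, non-blocking
        private final Selector selector;       // assumed to be the selector the channel is registered with
        private ByteBuffer pending;            // data left over from a short write

        EagerWriter(SocketChannel channel, Selector selector) {
            this.channel = channel;
            this.selector = selector;
        }

        // Called by the user's send(): try to write right away instead of waiting
        // for the selector thread; this avoids the wakeup-to-write latency.
        public synchronized void send(ByteBuffer data) throws IOException {
            channel.write(data);
            if (data.hasRemaining()) {
                // Short write: keep the rest and ask the selector to tell us
                // when the channel is writable again.
                pending = data;
                SelectionKey key = channel.keyFor(selector);
                key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
                selector.wakeup();
            }
        }

        // Called from the selector loop when the key reports isWritable().
        public synchronized void onWritable(SelectionKey key) throws IOException {
            if (pending != null) {
                channel.write(pending);
                if (!pending.hasRemaining()) {
                    pending = null;
                    // Drained: stop asking for OP_WRITE so the selector doesn't spin.
                    key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
                }
            }
        }
    }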

  • Java Disk NIO vs BDB Performance

    Hi guys,
    Does anyone have any stats or comparison studies of which performs better: disk NIO or BDB?
    Thanks very much.

    Moving minUtilization up will lower performance, but the amount depends on your write rate and many other factors. You will not find numbers on this, only a qualitative answer. You will need to test it.

  • Performance of java nio with dd in linux.

    Hi
    I ran this code in Java to dump zeros into a file of size 1 GB and tried the same with dd on Linux.
    Java code: I use a preinitialized array (the data) and fill a MappedByteBuffer with it.
    package filePersistence.test;
    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    public class TimeLineTest {
        private static final int BYTE_LENGTH = 1000;
        // this is the input
        static byte[] b = new byte[BYTE_LENGTH];

        public static void main(String[] arg) throws Exception {
            // initializing the input
            for (int i = 0; i < BYTE_LENGTH; i++)
                b[i] = 0;
            File file = new File(arg[0]);
            RandomAccessFile raFile = new RandomAccessFile(file, "rw");
            FileChannel fChannel = raFile.getChannel();
            int loopCount = 1000000;
            MappedByteBuffer mbuffer = fChannel.map(FileChannel.MapMode.READ_WRITE, 0, loopCount * BYTE_LENGTH);
            //System.out.println(" going to fill in a file" + file.getName());
            long startTime = System.currentTimeMillis();
            for (int i = 0; i < loopCount; i++) {
                // populate the mapped buffer
                mbuffer.put(b);
            }
            // persist into the file
            mbuffer.force();
            long endTime = System.currentTimeMillis();
            System.out.println(" file filled size1 " + file.length());
            System.out.println(" time " + (endTime - startTime));
        }
    }
    On a linux machine this takes around 7 secs while dd used as
    "dd if=/dev/zero of=mytestfile.out bs=1000 count=1000000"
    1000000+0 records in
    1000000+0 records out
    1000000000 bytes (1.0 GB) copied, 4.618 seconds, 217 MB/s
    4.6 and 7 differ quite a lot. Is there a way the Java code can be improved to match dd? (-server does not help)
    Thanks
    Sumanta

    Hi
    Can this be called dd-equivalent code?
    package filePersistence.test;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    public class TimeLineTest {
        private static final int BYTE_LENGTH = 1000;
        // this is the input
        static byte[] b = new byte[BYTE_LENGTH];

        public static void main(String[] arg) throws Exception {
            // initializing the input
            for (int i = 0; i < BYTE_LENGTH; i++)
                b[i] = 0;
            String srcFile = arg[0] + "_src";
            FileOutputStream fos = new FileOutputStream(srcFile);
            fos.write(b);
            fos.getFD().sync();
            fos.flush();
            fos.close();
            RandomAccessFile srcraFile = new RandomAccessFile(srcFile, "rw");
            FileChannel srcChannel = srcraFile.getChannel();
            File file = new File(arg[0]);
            RandomAccessFile raFile = new RandomAccessFile(file, "rw");
            FileChannel raChannel = raFile.getChannel();
            int loopCount = 1000000;
            //MappedByteBuffer mbuffer = fChannel.map(FileChannel.MapMode.READ_WRITE, 0, loopCount * BYTE_LENGTH);
            //System.out.println(" going to fill in a file" + file.getName());
            long startTime = System.currentTimeMillis();
            for (int i = 0, position = 0; i < loopCount; i++, position += BYTE_LENGTH) {
                // populate the destination from the source file
                //mbuffer.put(b);
                raChannel.position(position);
                srcChannel.transferTo(0, BYTE_LENGTH, raChannel);
            }
            // persist into the file
            //mbuffer.force();
            raFile.getFD().sync();
            raChannel.close();
            raFile.close();
            long endTime = System.currentTimeMillis();
            System.out.println(" file filled size1 " + file.length());
            System.out.println(" time " + (endTime - startTime));
        }
    }
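
    Another way to approximate what dd does is a plain sequential FileChannel.write() of the same zero-filled block, with no memory mapping and no second source file. This is only a sketch under the same 1 GB / 1000-byte-block assumptions as the code above, not something the thread benchmarked:

    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    public class DdLikeWrite {
        public static void main(String[] args) throws Exception {
            final int blockSize = 1000;
            final int blockCount = 1000000;          // 1 GB total, like bs=1000 count=1000000
            ByteBuffer block = ByteBuffer.allocateDirect(blockSize); // direct buffers start zero-filled

            RandomAccessFile raFile = new RandomAccessFile(args[0], "rw");
            FileChannel channel = raFile.getChannel();

            long start = System.currentTimeMillis();
            for (int i = 0; i < blockCount; i++) {
                block.clear();                       // rewind the same zeroed block each time
                channel.write(block);                // sequential write, position advances automatically
            }
            channel.force(false);                    // flush to disk before stopping the clock
            channel.close();
            raFile.close();
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("wrote " + (long) blockSize * blockCount + " bytes in " + elapsed + " ms");
        }
    }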

  • Looking for performance boost to java.nio.charset.encode(String) method

    hi ,
    I have run JProfiler on my application and saw that this method takes 4% of the overall CPU time (!).
    Is there any other, faster way to do this encoding?
    thanks!

    yjavaman wrote:
    I suspect that in my application (multi-threaded) it is actually a larger bottleneck than I think right now.
    I asked about maybe caching the encoder / decoder; maybe I can save this time by doing other stuff.
    Not sure if you are already buffering, but from the javadoc:
    java.io.OutputStreamWriter
    Each invocation of a write() method causes the encoding converter to be invoked on the given character(s). The resulting bytes are accumulated in a buffer before being written to the underlying output stream. The size of this buffer may be specified, but by default it is large enough for most purposes. Note that the characters passed to the write() methods are not buffered.
    For top efficiency, consider wrapping an OutputStreamWriter within a BufferedWriter so as to avoid frequent converter invocations. For example:
    Writer out = new BufferedWriter(new OutputStreamWriter(System.out));
    Just something simple to use.

  • Java NIO locking and NTFS network resources

    Hi all - just ran into a really nasty situation and I was wondering if anyone else has hit it and might have some suggestions.
    Platform: JRE 1.4_02 on a Win XP machine
    The following test code locks a file, then copies it to another location using NIO.
    When I run it with source path on my local drives (C), it works fine. If I run it with source path on a network shared resource, it fails with an IOException with description 'Error performing inpage operation'.
    If I disable the lock immediately before the copy operation, it works fine.
    My conclusion is that there is something about the NIO locking implementation that prevents it from working properly with NTFS volumes on other hosts. Can this be right? I've found the following bug report:
    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4774175
    but this seems like a huge problem that would prevent folks from using NIO in many, many applications. Maybe I'm wrong on something here...
    Anyway, here's the test code:
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.channels.FileChannel;
    import java.nio.channels.FileLock;
    /*
     * Created on May 28, 2004
     * (c) 2004 Trumpet, Inc.
     * @author kevin
     */
    public class test {
        private void createFile(File f) throws IOException {
            FileOutputStream os = new FileOutputStream(f);
            for (int i = 0; i < 10; i++) {
                os.write(i);
            }
            os.close();
        }

        public test() {
            boolean testWithReleasingLockPriorToCopy = false;
            final File f1 = new File("w:/temp/test2.lok");
            final File f2 = new File("w:/temp/test.lok");
            f1.delete();
            f2.delete();
            try {
                createFile(f1);
                RandomAccessFile raf1 = new RandomAccessFile(f1, "rw");
                RandomAccessFile raf2 = new RandomAccessFile(f2, "rw"); // destination file
                FileChannel ch1 = raf1.getChannel();
                FileChannel ch2 = raf2.getChannel();
                FileLock flock1 = ch1.lock();
                if (!f2.getParentFile().exists() && !f2.getParentFile().mkdirs())
                    throw new IOException("Unable to create directories for destination file '" + f2 + "'");
                if (testWithReleasingLockPriorToCopy)
                    flock1.release();
                ch1.transferTo(0, raf1.length(), ch2);
                raf1.close();
                raf2.close();
            } catch (Exception e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            test t = new test();
        }
    }

    Does anyone have any pointers here? I need to be able to exclusively lock a file on a network drive (preventing any other applications from opening it), then make a copy of it. I can't use regular stream operations, because the lock prevents them from working properly (it appears that, once you grab a file lock using NIO, the only way your application can use the file is via the NIO operations; using stream operations fails...).
    Thanks in advance for any help!
    - Kevin

    I've run into the same problem recently: channels work fine for local file locking, but when you turn to the network, they fail to accurately handle locks.
    I ended up writing a JNI utility to ship with my Java application that locks files using native Windows calls.
    my .c file ends up looking something like this:
    JNIEXPORT jint JNICALL Java_Mapper_NativeUtils_LockFile
      (JNIEnv *env, jobject obj, jstring filename)
    {
        const char* ntvFilename = (*env)->GetStringUTFChars(env, filename, 0);
        int retVal = (int)CreateFile(
            ntvFilename,
            GENERIC_WRITE,
            FILE_SHARE_READ,
            0,
            OPEN_EXISTING,
            FILE_FLAG_SEQUENTIAL_SCAN,
            0
        );
        // add code to throw java exceptions based on retVal
        if (retVal == (int)INVALID_HANDLE_VALUE)
            return retVal;
        (*env)->ReleaseStringUTFChars(env, filename, ntvFilename);
        return retVal;
    }

    JNIEXPORT jboolean JNICALL Java_Mapper_NativeUtils_UnlockFile
      (JNIEnv *env, jobject obj, jint handle)
    {
        CloseHandle((void *)handle);
        return 1;
    }
    It's a little shy on the error checking side, but it provides support for network file locking that Java seems to lack.

  • O.S. / Hard Drive Size for NIO Server/Client's load testing...

    Hi All
    I am currently load testing an NIO server/client to see what the maximum number of connections that can be reached would be, using the following PC: P4, 3 GHz, 1 GB RAM, Windows XP SP2, 110 GB hard drive.
    However, I would like to test the Server/Client performance on different OS's:
    Which would be the best possible option from the following:
    1. Partition my current drive, (using e.g. Partition Magic), to e.g.
    - Win XP: 90 GB
    - Win Server 2000: 10 GB
    - Linux: 5 GB
    - Shared/Data: 5 GB
    2. Install a separate hard drive for the different OS's
    3. Use a disk caddie, to swap in/out test hard drives.
    4. Any thing else?
    - Would the Operating System's hard drive size affect the Server/Client's performance, e.g. affecting the number of connections, number of File Handles, the virtual memory, etc.?
    Many Thanks,
    Matt

    You can use a partition on the same HDD or a second HDD; a disk caddie is fine if it's a direct IDE or SCSI connection. If it's USB, no, it will be too slow; maybe if you have FireWire, but I still don't recommend it.
    Be careful: if you don't have any experience installing Linux you may end up with multiple partitions on your disk without knowing it, because Linux ext partitions are not visible to Windows.
    The recommended disk size for Fedora is 10 GB. This is the amount of data that will be created on your HDD when you do a full installation.

  • Help to boost the performance of my proxy server

    Out of my personal interest, I am developing a proxy server in java for enterprises.
    I've made the design as such the user's request would be given to the server through the proxy software and the response would hit the user's browsers through the proxy server.
    User - > Proxy software - > Server
    Server -> Proxy software -> User
    I've designed the software in Java and it is working fine with HTTP and HTTPS requests. The problem I am so scared of is that for each user request I am creating a thread to serve it. So if 10,000 users access the proxy server concurrently, I fear my proxy server would be bloated by consuming all the resources of the machine where the proxy software is installed. This is because I'm using threads for serving the request and response.
    Is there any alternative solution for this in Java?
    Somebody insisted that I use Java NIO. I'm confused. I need a solution that gets my proxy server past this performance issue. I want my proxy server to be the first proxy server which is entirely written in Java and has good performance, suitable even for large organisations (like Sun Java Web Proxy Server, which is written in C).
    How can I boost the performance? I want users to have no sense of accessing the remote server through a proxy; it should be like accessing the web server without a proxy for them. There should be no performance lag, as fast as C. I need to do this in Java. Please help.

    I think having a thread per request is fine. Maybe I got it wrong, but I thought the point of using NIO with sockets was to get rid of the one-thread-per-request combo?
    Correct. A server which has one thread per client doesn't scale well.
    Kaj
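
    A bounded thread pool is a common middle ground that isn't discussed in the thread above; purely as an illustrative sketch (the port number and the handler body are placeholders), it caps thread and memory usage without switching to NIO:

    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PooledProxyAcceptor {
        public static void main(String[] args) throws Exception {
            // A fixed pool bounds the thread count no matter how many clients
            // connect; excess connections queue up instead of spawning threads.
            ExecutorService pool = Executors.newFixedThreadPool(200);
            ServerSocket listener = new ServerSocket(8080);   // hypothetical proxy port
            while (true) {
                Socket client = listener.accept();
                pool.execute(() -> handle(client));
            }
        }

        // Placeholder for the real proxy logic: read the request, forward it
        // to the origin server, and stream the response back to the client.
        static void handle(Socket client) {
            try (Socket c = client) {
                // ... proxy request/response here ...
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }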

  • Thread blocking on java.nio.charset.CoderResult

    Hello all,
    I have a multi-threaded app which does some fairly intensive string operations (basically extracting text from documents for indexing in a search system).
    I am seeing a massive bottleneck around the java.nio.charset.CoderResult class. When profiling, I see a whole stack of threads blocking on (waiting for) a monitor on the java.nio.charset.CoderResult class. It seems to be a result of string encoding/decoding (I am often encoding strings as UTF-8).
    Does anyone know why the JVM would want my threads to sync on this class? It's creating a huge performance issue for my app. Approximately 15% of ALL the processing time is spent waiting for this class.
    Help!

    I would guess that you're using some of the static methods in the CoderResult class. The static methods CoderResult.unmappableCache(), CoderResult.malformedForLength() and CoderResult.malformedCache all use a static inner class called Cache. Its get() method is synchronized on Cache.class. Since the Cache inner-class is static, any part of your multi-threaded application that goes through the Cache.get() method is going to be waiting for the lock on Cache.class.
    Could you create a CoderResult instance for each thread? That would mean that there would be a different static Cache class for each thread, reducing the number of threads competing for the Cache.class lock.
    I'd have to see some of your code to give a better answer.
    Brian
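
    A minimal sketch of the per-thread idea suggested above, using a ThreadLocal CharsetEncoder so each thread encodes with its own instance instead of funnelling through a shared, synchronized path; whether this removes the particular contention seen in the profiler would need to be verified against the real workload:

    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetEncoder;

    public class PerThreadUtf8 {
        // One encoder per thread; CharsetEncoder itself is not thread-safe,
        // so this also avoids synchronizing on a shared encoder.
        private static final ThreadLocal<CharsetEncoder> ENCODER =
            ThreadLocal.withInitial(() -> Charset.forName("UTF-8").newEncoder());

        public static byte[] encode(String s) throws CharacterCodingException {
            ByteBuffer bb = ENCODER.get().reset().encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[bb.remaining()];
            bb.get(bytes);
            return bytes;
        }

        public static void main(String[] args) throws Exception {
            System.out.println(encode("hello").length); // prints 5
        }
    }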
