Long running/Blocking methods

Long running/Blocking methods

kramer
Hi,

I'm having a hard time wrapping my head around blocking jobs in a non-blocking environment, and here are my two questions about them:

1. What is the difference between these three ways of calling the blocking method "thirdPartyHttpGet" :

1.a
blocking { thirdPartyHttpGet() } then { httpResponse -> render httpResponse }

1.b
observe({ blocking { thirdPartyHttpGet() } }).subscribe({ httpResponse -> render httpResponse })

1.c
ExecutorService executor = new ThreadPoolExecutor(4, 4, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>())

Observable.create({ observer ->
    // run the blocking call on our own pool, then push the result to the observer
    executor.execute({
        def response = thirdPartyHttpGet()
        observer.onNext(response)
        observer.onCompleted()
    })
}).subscribe({ httpResponse -> render httpResponse })

1.d I think this can also be done with RxJava schedulers, but I don't have an example yet :)


2. Considering the above code samples we are creating a new thread to execute blocking section, right? Then how come this is different than the old threaded model? We are still creating a new thread for each request (in the old model server did it, now we are doing it), and we can still exhaust the ThreadPoolExecutor... What am I missing here?

Re: Long running/Blocking methods

Luke Daley
Administrator
Your 1[a-c] all achieve the same thing.

A key difference between [a-b] and [c] is that [a-b] use Ratpack’s caching thread pool to perform the blocking operation, and participate in Ratpack’s logical execution model (see http://www.ratpack.io/manual/current/api/ratpack/exec/Execution.html). This is preferable to using your own set of threads for blocking ops.

The difference between [a] and [b] is that RxJava offers a suite of composable functions for constructing complex data flows with async apis (which is what blocking {} is). Using raw Ratpack promises is fine for simple processing, but if it’s complex then you are going to get into nested callbacks which is never good. RxJava solves this, which is its whole reason for existing.

As for #2 you are right, to a point. Blocking is not without cost and should be avoided where possible, but that’s not always possible.

Threads are reused, so it isn’t always creating a new thread per blocking operation (a J2EE container will reuse threads as well). 

It comes down to fewer threads competing for CPU time. In thread-per-connection with 100 threads, you could potentially have 100 threads competing for compute time. In the non-blocking model, you have a small number of threads competing for compute time (typically num cores * 2). By pushing blocking ops on to a separate thread pool, you give the compute thread back for other compute work. The thread scheduler can detect when a thread is blocked on IO or a lock and not schedule it. Granted, there is some compute time needed to move the data from the blocking thread back on to the event loop, but it’s still better than 100 threads competing for CPU time.
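The hand-off described above can be sketched with plain JDK executors. This is only an illustration under stated assumptions, not Ratpack's implementation: the two pools, their sizes, and the `thirdPartyHttpGet` stand-in (which just sleeps) are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

final class TwoPoolSketch {
    // Daemon threads with a fixed name, so the demo exits cleanly and the output is stable.
    static ThreadFactory daemon(String name) {
        return r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        };
    }

    // Small fixed pool standing in for the event loop (assumption: 2 threads).
    static final ExecutorService COMPUTE = Executors.newFixedThreadPool(2, daemon("compute"));
    // Separate growable pool that is allowed to block.
    static final ExecutorService BLOCKING = Executors.newCachedThreadPool(daemon("blocking"));

    // Hypothetical stand-in for a slow, blocking third-party call.
    static String thirdPartyHttpGet() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "fetched on " + Thread.currentThread().getName();
    }

    // Block on the blocking pool, then hop back to the compute pool to finish the work.
    static CompletableFuture<String> handle() {
        return CompletableFuture
            .supplyAsync(TwoPoolSketch::thirdPartyHttpGet, BLOCKING)
            .thenApplyAsync(body -> body + ", rendered on " + Thread.currentThread().getName(), COMPUTE);
    }

    public static void main(String[] args) throws Exception {
        // prints "fetched on blocking, rendered on compute"
        System.out.println(handle().get(5, TimeUnit.SECONDS));
    }
}
```

While the call sleeps on the blocking pool, the two compute threads stay free for other work; only the short continuation runs on them.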

For best performance, you have to use non blocking APIs (BTW, Ratpack’s HTTP client is non blocking). Most real applications are going to be forced to use a mix of blocking and non blocking though because non blocking is not always available (e.g. JDBC).

If you don’t care about performance and want simplicity, you can always bump the number of threads in Ratpack’s event loop. Then you can just block in the event loop.

Re: Long running/Blocking methods

kramer
Thank you very much Luke, this clarifies a lot for me but not without raising some new questions

When you say "It comes down to fewer threads are competing for CPU time", I can see the reasoning when one is doing IO-bound ops, but what if all you are doing is compute-heavy? Taking this as the worst-case scenario: if I need to run a 5-second computation for each request, then all threads will be busy rather than waiting. And if I have a 10-thread pool, doesn't this essentially mean I can only handle a burst of 10 requests? (I know we are not blocking the HTTP handler part, so it can take requests, but I guess the pool will stop accepting new tasks at some point; otherwise it's a never-ending pile-up on the pool.) So is it correct to say that in this scenario there is no advantage over the old threaded blocking model?

Another thing is that I want to hide some of the blocking method chains and/or observables in my service layer (away from ratpack.groovy) and I'm wondering what would be the preferred way to perform "blocking"s outside ratpack DSL? e.g.: pass the context from DSL to service layer...

Re: Long running/Blocking methods

Luke Daley
Administrator
If you do have such heavy computation to do, the advantage of this model is that incoming connections will time out because they are not accepted quickly enough, rather than getting responded to very slowly and probably eventually making the server fall over because it was never resourced to support all threads maxed out. This is one of the problems with the thread-per-connection model; it scales unpredictably. In the thread-per-core model, things are more predictable because the external queue (in this case the TCP stack) will just not feed into the system. It’s generally better that some clients get denied than to bring the system to its knees.
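As a rough analogy for "some clients get denied", here is a minimal JDK-only sketch of a fixed pool with a bounded queue that rejects work once it is full. It does not model the TCP stack or Ratpack; the class name, the pool sizes, and the latch that keeps every task busy are assumptions made for the illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedAdmissionSketch {
    // Submits `requests` tasks that all stay busy, and returns how many were refused up front.
    static int countRejected(int workers, int queueSize, int requests) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            workers, workers, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueSize),      // bounded: no endless pile-up
            new ThreadPoolExecutor.AbortPolicy());    // refuse instead of queueing forever
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();          // simulate a long computation
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                       // this "client" is denied immediately
                }
            }
        } finally {
            release.countDown();                      // let the accepted tasks finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        // 2 running + 3 queued are accepted, the remaining 5 are rejected: prints 5
        System.out.println(countRejected(2, 3, 10));
    }
}
```

The accepted work is capped at workers plus queue size, so load beyond what the system was resourced for fails fast instead of slowing everything down.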

In either case, you should not be doing such heavy computation during request processing. You should be queueing it for another background job system that can inform when done (i.e. process it asynchronously). 

You don’t want to pass the Context into the service layer. You want to use http://www.ratpack.io/manual/current/api/ratpack/exec/ExecControl.html

If you’re using the Guice module, you can inject an instance of this into your services (it’s a singleton effectively).

If you are wiring things up manually, get an instance via the LaunchConfig given to your HandlerFactory: launchConfig.execController.control.
 
Once you start doing this, you’ll very quickly want to use RxJava and return an Observable everywhere.

Re: Long running/Blocking methods

kramer
One more question: how would you test an ExecControl injected class?

I've tried manually injecting the ExecControl obtained by the code below into my class, but I keep getting the error: "ratpack.exec.ExecutionException: No execution is bound to the current thread (are you calling this from a blocking operation or a manual created thread?)"
new ClosureBackedEmbeddedApplication().server.launchConfig.execController.control

Re: Long running/Blocking methods

Luke Daley
Administrator


You'll need to provide more info on what you are trying to do.

As the exception says, you're trying to use the exec control from an unmanaged thread.

Re: Long running/Blocking methods

kramer
In reply to this post by kramer
I am trying to test this class:

class X {

    def generators = [] // filled with biz logic closures which return Iterable<String>
    ExecControl execControl

    public X(ExecControl execControl) {
        this.execControl = execControl
    }

    rx.Observable<String> doWork(y) {
        rx.Observable.merge(generators.collect({ generator ->
            observeEach(execControl.blocking { generator(y) })
        }))
    }
}

and this is what I've tried:

class XTest {
    @Test
    void testBasicFlow() {
        def app = new ClosureBackedEmbeddedApplication()
        app.bindings {
            init {
                RxRatpack.initialize()
            }
        }
        app.server.start()

        X x = new X(
                app.server.launchConfig.execController.control
        )

        x.doWork(sampleY).subscribe {
            println it
        }
    }
}

Re: Long running/Blocking methods

uris77
I haven't tried Rx with Ratpack yet, so this is just me thinking out loud.

I would mock execControl, and just verify that it is calling blocking(). Since it takes a closure, I would move that closure to another class, and do finer-grained unit tests on that class for the business logic. My reasoning is that since ExecControl is specific to Ratpack, I would want to isolate it from application-specific code.
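A minimal sketch of that isolation idea, using a hand-written fake rather than a mocking library. Everything here is hypothetical: `BlockingRunner` is an invented seam and is not Ratpack's ExecControl API, and `GreetingService` merely stands in for the class under test.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical seam the service depends on; NOT Ratpack's ExecControl.
interface BlockingRunner {
    <T> CompletableFuture<T> blocking(Callable<T> work);
}

// Stand-in for application code that delegates blocking work to the seam.
final class GreetingService {
    private final BlockingRunner runner;

    GreetingService(BlockingRunner runner) {
        this.runner = runner;
    }

    CompletableFuture<String> greet(String name) {
        return runner.blocking(() -> "hello " + name);
    }
}

// Test double: counts calls and runs the work inline on the calling thread.
final class RecordingRunner implements BlockingRunner {
    final AtomicInteger calls = new AtomicInteger();

    @Override
    public <T> CompletableFuture<T> blocking(Callable<T> work) {
        calls.incrementAndGet();
        try {
            return CompletableFuture.completedFuture(work.call());
        } catch (Exception e) {
            CompletableFuture<T> failed = new CompletableFuture<>();
            failed.completeExceptionally(e);
            return failed;
        }
    }
}

final class FakeRunnerDemo {
    public static void main(String[] args) {
        RecordingRunner fake = new RecordingRunner();
        GreetingService service = new GreetingService(fake);
        System.out.println(service.greet("kramer").join()); // prints "hello kramer"
        System.out.println(fake.calls.get());               // prints 1
    }
}
```

This checks that the service routes its work through the seam, but it says nothing about how the real execution controller schedules that work, which is the part an integration test would have to cover.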

Re: Long running/Blocking methods

kramer
Roberto, I'm already unit testing other parts of the code, I just want to do an integration test where I can validate that RxJava & blocking code sections actually play out as intended :) .

Re: Long running/Blocking methods

uris77

Re: Long running/Blocking methods

kramer
Yup, it does indeed help in cases where you want to bring up the whole Ratpack application, which would be a functional test I assume (and I'm already using it). But in this particular case I want to test just this class while still having the execution controller integration from Ratpack... this is the part where I'm stuck.

Re: Long running/Blocking methods

Luke Daley
Administrator
Here’s something that should help: https://gist.github.com/alkemist/1f45f1b95eb83286ea08

We can definitely make this easier and remove some of the boilerplate. Just haven’t gotten to it yet.

Re: Long running/Blocking methods

kramer
Awesome, that did the trick!

Thanks a heap Luke & Roberto.

Re: Long running/Blocking methods

Luke Daley
Administrator

Re: Long running/Blocking methods

kramer
I have a weird usage pattern that I tried to recreate (a simplified version) in this gist: https://gist.github.com/kramer/4fea054842bd9f76441e 

In this example
 [1] works for the first few hundred calls (that multi makes to single), then starts getting connection timeouts and eventually connection refused errors. I guess this "connection refused" part was the predictable failure part you mentioned, right?
 [2] works without errors
 [3] only prints operation timed out exceptions
 [4] is hacky, but I wanted to see what happens. It mostly works, but if you call it too many times it starts giving the timeout errors too.

Any ideas why only [4] works ok and not the others?

Re: Long running/Blocking methods

Luke Daley
Administrator
 I’m not sure about [1]. For me it just times out. It looks like a deadlock in Rx. 

I would indeed expect [2] to work.  

[3] is a _really_ bad idea :) - you’re flooding the event loop with blocking operations. Requests can’t be received because all the threads are blocked, and because requests need to be processed in order to unblock them, you are effectively in deadlock.
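The shape of that deadlock can be reproduced with a single-threaded JDK executor standing in for the event loop. This is a simplified sketch, not the code from the gist: the one-thread "loop", the 200 ms timeout, and the class name are assumptions chosen to keep the demo short.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class LoopStarvationSketch {
    // Returns true if a task blocked on its own loop times out waiting for work
    // that only the same loop could have run.
    static boolean timesOutWaitingOnOwnLoop() throws Exception {
        ExecutorService loop = Executors.newSingleThreadExecutor(); // the whole "event loop"
        try {
            Future<Boolean> outer = loop.submit(() -> {
                // The inner task is queued behind the task that is waiting for it.
                Future<String> inner = loop.submit(() -> "response");
                try {
                    inner.get(200, TimeUnit.MILLISECONDS); // blocks the only loop thread
                    return false;
                } catch (TimeoutException e) {
                    return true;                           // nobody was free to run `inner`
                }
            });
            return outer.get(5, TimeUnit.SECONDS);
        } finally {
            loop.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(timesOutWaitingOnOwnLoop()); // prints true
    }
}
```

Without the timeout the outer task would wait forever; with it, the caller only sees "operation timed out", which matches the symptom reported for [3].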

[4] should work (and is more correct than 3). I suspect the timeouts are just due to 5s not being long enough under the load.  

A _much_ better idea is to not block here. I’ve forked the gist to use async client in a non blocking manner, and also added usage of Ratpack’s own http client.  

https://gist.github.com/alkemist/e5cdb22d9cf14118932c 

Both of these perform much better for me, and don’t flatline the cpu as the blocking variants do.

Re: Long running/Blocking methods

kramer
Thanks for the information. I tried Ratpack's HTTP client in earlier versions but had some problems and replaced it with ning (I was probably doing some wrong scheduling again). I'll try it again.

Is there a proper way to get the scheduler for the blockingExecutor? (In the example I used Groovy's ability to access private members.)

Re: Long running/Blocking methods

Luke Daley
Administrator
It’s private, but I’ll open it up.

Re: Long running/Blocking methods

kramer
One more thing: how do I set a proxy on Ratpack's HTTP client?

Re: Long running/Blocking methods

Luke Daley
Administrator
It doesn’t support proxies yet.