The task is as follows: for each incoming track (there may be thousands of them), I need to request data from the scrobbler, process the response (omitted from the example for clarity), and write the result to a file. I decided to use TPL Dataflow, and this is what I came up with:

    static void Main()
    {
        HttpClient hc = new HttpClient();
        StreamWriter sw = new StreamWriter(@"C:\res.txt");

        // first variant
        TransformBlock<string, string> tb = new TransformBlock<string, string>(
            item => hc.GetStringAsync(item),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, BoundedCapacity = 200 });

        // second variant
        //TransformBlock<string, string> tb = new TransformBlock<string, string>(
        //    item => new HttpClient().GetStringAsync(item),
        //    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, BoundedCapacity = 200 });

        ActionBlock<string> ab = new ActionBlock<string>(item =>
        {
            sw.WriteLine(item);
            sw.WriteLine("________________________________________________");
        }, new ExecutionDataflowBlockOptions { BoundedCapacity = 200 });

        tb.LinkTo(ab);
        tb.Completion.ContinueWith(item => ab.Complete());
        ab.Completion.ContinueWith(item => sw.Dispose());

        Stopwatch swa = new Stopwatch();
        swa.Start();
        foreach (var item in urls)
        {
            tb.Post(item);
        }
        tb.Complete();
        ab.Completion.Wait();
        Console.WriteLine(swa.ElapsedMilliseconds);
    }

As you have probably noticed, there are two variants of the solution:

  1. In the first variant I use a single shared HttpClient object, but it can send only two requests at a time (by the way, why is that?), which is why the whole process takes quite a while.

  2. The second variant is already faster, and much faster still if the degree of parallelism is raised from 4 to, say, 20. It is not without drawbacks, though: creating each HttpClient object means setting up a new connection (handshake), which is logical. As I understand it, this is extra overhead that can be avoided.

So my question is: is it possible to somehow bind one HttpClient object to a task, so that when it has finished one request, the next request goes through the same HttpClient object and no new connection has to be established? That is, I want there to be as many HttpClient objects as MaxDegreeOfParallelism, with each task using an HttpClient object that is not occupied by other tasks at that moment. Or perhaps there is another effective solution.

Thank you in advance

    2 answers

    Create a pool of HttpClient objects. Before each request, take an object from the pool (or create a new one if the pool is empty), and after the request, return it to the pool.

        ConcurrentBag<HttpClient> pool = new ConcurrentBag<HttpClient>();
        TransformBlock<string, string> tb = new TransformBlock<string, string>(async item =>
        {
            HttpClient hc;
            if (!pool.TryTake(out hc))
            {
                hc = new HttpClient();
            }
            try
            {
                return await hc.GetStringAsync(item).ConfigureAwait(false);
            }
            finally
            {
                pool.Add(hc);
            }
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4, BoundedCapacity = 200 });
    • Why ConfigureAwait(false) specifically in this example? - Qutrix
    • @Qutrix The code after the await ( pool.Add(hc) ) is not tied to the synchronization context, so there is no point in capturing the context and resuming on it, which is what happens by default. - PetSerAl
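
On the "only two requests at a time" observation from the question: a likely cause, assuming the code runs on .NET Framework, is that ServicePointManager.DefaultConnectionLimit defaults to 2 connections per host for non-ASP.NET applications. Under that assumption, a single shared HttpClient may be enough once the limit is raised. The sketch below is only an illustration; the SharedClientSketch class and its Create method are hypothetical names, not part of the original code. On .NET Core and later the corresponding setting is HttpClientHandler.MaxConnectionsPerServer, which is not limited to 2 by default.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks.Dataflow;

static class SharedClientSketch
{
    // Hypothetical helper: one shared HttpClient behind a TransformBlock.
    public static TransformBlock<string, string> Create(HttpClient client, int parallelism)
    {
        // Assumption: .NET Framework, where this per-host limit defaults to 2
        // for non-ASP.NET applications. Set it before the first request is sent.
        ServicePointManager.DefaultConnectionLimit = parallelism;

        return new TransformBlock<string, string>(
            url => client.GetStringAsync(url),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = parallelism,
                BoundedCapacity = 200
            });
    }
}
```

With this approach the connections are reused by the shared client, so no per-request handshake is needed, and the block can be linked to the ActionBlock exactly as in the question.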

    Create four TransformBlocks instead of one, and give each its own HttpClient.

    Use another block as the source.

    • First, I have already parallelized it (MaxDegreeOfParallelism = 4); second, linking the blocks seems more complicated; third, 4 is just for the example, in practice I would want at least 30; fourth, I need to preserve order, and in this case it is not guaranteed. - Qutrix
    • @Qutrix what difference does it make, 4 or 30? Preserving order is harder, though... - Pavel Mayorov
    • I have not yet come across the "one source, many targets" case: which target will a message be sent to? All of them? Just one? If one, then which? - Qutrix
    • @Qutrix Just one. It is not specified which (but certainly not one that is full). - Pavel Mayorov
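
The "one source, several TransformBlocks" idea from this answer could be wired up roughly as follows. This is a minimal sketch, not the answerer's actual code: the FanOutSketch class, the RunAsync method and its parameters are hypothetical names, and a BufferBlock is assumed as the source block. As noted in the comments, the output order is not preserved.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class FanOutSketch
{
    // Hypothetical helper: fans urls out over workerCount TransformBlocks,
    // each owning one HttpClient, and passes every response to handle.
    public static async Task RunAsync(string[] urls, int workerCount, Action<string> handle)
    {
        var source = new BufferBlock<string>();
        var sink = new ActionBlock<string>(handle); // processes one item at a time
        var link = new DataflowLinkOptions { PropagateCompletion = true };
        var clients = Enumerable.Range(0, workerCount).Select(_ => new HttpClient()).ToArray();

        var workers = clients.Select(client =>
        {
            var worker = new TransformBlock<string, string>(
                url => client.GetStringAsync(url),
                // BoundedCapacity = 1: a busy worker declines new items, so the
                // source offers them to another linked worker instead.
                new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
            source.LinkTo(worker, link);
            worker.LinkTo(sink); // the sink is completed manually below
            return worker;
        }).ToArray();

        foreach (var url in urls)
            await source.SendAsync(url);
        source.Complete();

        try
        {
            // Completes when every worker is done; rethrows if a worker faulted.
            await Task.WhenAll(workers.Select(w => w.Completion));
        }
        finally
        {
            sink.Complete();
            await sink.Completion;
            foreach (var client in clients)
                client.Dispose();
        }
    }
}
```

A call might look like `await FanOutSketch.RunAsync(urls, 30, sw.WriteLine);`. The sink cannot simply use PropagateCompletion from the workers, because the first worker to finish would complete it while the others are still running, hence the Task.WhenAll step.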