Hi,

I'm not quite sure what 'feature' I'm looking for ... any input appreciated.

I want to parallelize a particular task.

    #!/usr/bin/perl -w
    use strict;
    my @target;
    our @result;

    for (my $i = 0; $i < @target; $i++) {
        $result[$i] = &do_some_work($target[$i]);
    }
    &report_results;
    ...

&do_some_work requires a minute or so to complete. @target contains several hundred elements. Therefore, total execution time runs in the hundreds of minutes.

Also, @target is not ordered ... e.g. there are no dependencies within @target ... if &do_some_work finishes processing $target[159] before it starts (or finishes) $target[17], no problems.

I figure that if I could find a way to spawn lots of copies of &do_some_work, I could reduce total execution time. Assuming that my machine has sufficient resources, I might even get total execution time down to a minute or so. This would be a major win for me -- I would like this app to complete within ten minutes at the outside.

What Perl 'feature' should I explore to do this? Am I walking into 'threads' here?

--sk

Stuart Kendrick
FHCRC
skendric@fhcrc.org (Stuart Kendrick) writes:

> I'm not quite sure what 'feature' I'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.
>
> #!/usr/bin/perl -w
> use strict;

You should probably prefer 'use warnings;' to the -w flag these days. I still use -w, but it's mostly finger macros I haven't retrained yet.

> my @target;
> our @result;
>
> for (my $i = 0; $i < @target; $i++) {
>     $result[$i] = &do_some_work($target[$i]);
> }
> &report_results;
> ...

Ack, don't *do* that. Specifically, don't call subs with &. See perlfaq7, "What's the difference between calling a function as &foo and foo()?"

You can probably get away with just fork()ing inside do_some_work() (note lack of '&'). 'perldoc -f fork' should give you the skinny. See also perlipc for a slightly broader view.

-=Eric

--
Come to think of it, there are already a million monkeys on a million
typewriters, and Usenet is NOTHING like Shakespeare.
        -- Blair Houghton.
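[A minimal sketch of the fork() approach suggested above. The @target data, the $MAX cap, and the do_some_work() stub are stand-ins, not the poster's real code. Note that each forked child gets its own copy of @result, so real results would have to travel back through some IPC channel (a pipe, a file, etc.) -- that is the main cost of fork here. See perlipc.]

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @target = map { "host$_" } 1 .. 8;   # stand-in data
    my $MAX     = 4;                        # cap on concurrent children
    my $running = 0;

    for my $t (@target) {
        if ($running >= $MAX) {
            wait();        # reap one finished child before spawning more
            $running--;
        }
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {   # child: do the slow job, then exit
            do_some_work($t);
            exit 0;
        }
        $running++;        # parent: keep looping
    }
    1 while wait() != -1;  # reap the stragglers before reporting

    sub do_some_work {
        my ($t) = @_;
        sleep 1;           # placeholder for the real minute-long job
    }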
skendric@fhcrc.org (Stuart Kendrick) wrote:

> I want to parallelize a particular task.
>
> #!/usr/bin/perl -w
> use strict;
> my @target;
> our @result;

Why 'our'?

> for (my $i = 0; $i < @target; $i++) {

    for my $i (0..$#target) {

or, better,

    push @result, do_some_work($_) for @target;

> $result[$i] = &do_some_work($target[$i]);

Don't call subs with &.

> }
> &report_results;
> ...
>
> &do_some_work requires a minute or so to complete. @target contains
> several hundred elements. Therefore, total execution time runs in the
> hundreds of minutes.
>
> Also, @target is not ordered ... e.g. there are no dependencies within
> @target ... if &do_some_work finishes processing $target[159] before
> it starts (or finishes) $target[17], no problems.
>
> I figure that if I could find a way to spawn lots of copies of
> &do_some_work ... that I could reduce total execution time.

This will only help if either your machine has more than one processor or do_some_work spends time doing nothing: say, waiting for results from the network. If the task is pure computation, multi-threading on a single-processor machine will increase the time taken to complete, due to threading overheads.

> Assuming that my machine has sufficient resources, I might even get
> total execution time down to a minute or so. This would be a major
> win for me -- I would like this app to complete within ten minutes
> at the outside.
>
> What Perl 'feature' should I explore to do this?
> Am I walking into 'threads' here?

Yup. Probably 'async'. Make sure you are using a post-5.8.0 perl, and read perldoc perlthrtut. If your tasks really are independent, about the only tricky bit should be making sure all the threads have finished before reporting the results.

Ben

--
Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
(Ted Hughes,        [ Heracles shoots Vulture with arrow. Vulture bursts into ]
/Alcestis/)         [ flame, and falls out of sight. ]        ben@morrow.me.uk
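[A minimal sketch of the threads approach described above: one thread per element, joined before reporting. Requires a threaded perl, 5.8.0 or later. The @target data and the two stub subs are hypothetical; unlike fork, a thread's return value comes back directly through join().]

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;

    my @target = map { "host$_" } 1 .. 8;   # stand-in data

    my @thr = map { threads->create(\&do_some_work, $_) } @target;

    # join() blocks until each thread finishes, so results are only
    # reported once every worker is done -- the "tricky bit" above.
    my @result = map { $_->join } @thr;

    report_results(@result);

    sub do_some_work {
        my ($t) = @_;
        sleep 1;             # placeholder for the real work
        return "$t done";
    }

    sub report_results { print "$_\n" for @_ }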
Hello Stuart,

> I'm not quite sure what 'feature' I'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.

Have a look at the documentation for the Parallel::ForkManager module; we've used it to great effect for certain tasks.

Hope this helps,

Simon Taylor
Stuart Kendrick wrote:

> I want to parallelize a particular task.

Forking multiple child processes is very easily done with the help of the CPAN module Parallel::ForkManager.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
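[A minimal sketch of the Parallel::ForkManager approach recommended in the two replies above. The module handles the spawning, capping, and reaping that a hand-rolled fork loop does manually; its run_on_finish callback is one way to collect per-child results, since P::FM can pass a data structure back from a child. The @target data and do_some_work() stub are hypothetical.]

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Parallel::ForkManager;   # CPAN module, not core

    my @target = map { "host$_" } 1 .. 8;   # stand-in data
    my @result;

    my $pm = Parallel::ForkManager->new(10);   # at most 10 children

    # Called in the parent each time a child exits; $data is the
    # reference the child handed to finish().
    $pm->run_on_finish(sub {
        my ($pid, $exit, $ident, $signal, $core, $data) = @_;
        $result[$ident] = $$data if defined $data;
    });

    for my $i (0 .. $#target) {
        $pm->start($i) and next;             # parent continues the loop
        my $r = do_some_work($target[$i]);   # child does the slow work
        $pm->finish(0, \$r);                 # ship result to the parent
    }
    $pm->wait_all_children;                  # block until all are done

    print "$_\n" for @result;

    sub do_some_work { my ($t) = @_; sleep 1; return "$t done" }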
It was a dark and stormy night, and Stuart Kendrick managed to scribble:

> Hi,
>
> I'm not quite sure what 'feature' I'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.

Depending on the task, it may not run any faster unless you have more than 1 CPU.

gtoomey
Thanx for all the input. Turns out that the parallel processes needed read/write access to data structures within the main process ... so I used threads and threads::shared.

Thanx also for the stylistic pointers ... I'm pulling the & and -w out of my scripts now.

I'm pleased with the result ... http://www.skendric.com/device/Cisco/shutdown-network ... a script which disables the access layer of our network in about a minute, thanx to the use of threading ... one thread per Ethernet switch. I hope we'll never use it ... but in the event of a catastrophic worm infection, I'm going to be real grateful that I have this tool available to me.

--sk

Stuart Kendrick
FHCRC
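[A minimal sketch of the threads + threads::shared pattern the follow-up describes: every thread reads and writes one data structure owned by the main process, guarded by lock() so updates don't interleave. The switch names and the shut_down() stub are hypothetical, not taken from the linked script.]

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use threads::shared;

    my %status : shared;                  # one hash, visible to every thread
    my @switch = map { "sw$_" } 1 .. 4;   # stand-in switch names

    my @thr = map { threads->create(\&shut_down, $_) } @switch;
    $_->join for @thr;                    # wait for every switch

    print "$_: $status{$_}\n" for sort keys %status;

    sub shut_down {
        my ($sw) = @_;
        # ... talk to the switch here ...
        lock(%status);                    # serialize writes to shared data
        $status{$sw} = 'disabled';
    }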