Feature #24015

Parallelize freenas-debug scripts

Added by Ash Gokhale almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Nice to have
Assignee: Ash Gokhale
Category: Middleware
Target version:
Estimated time:
Severity: New
Reason for Closing:
Reason for Blocked:
Needs QA: No
Needs Doc: Yes
Needs Merging: Yes
Needs Automation: No
Support Suite Ticket: n/a
Hardware Configuration:

Description

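Don't let long-running tasks stack up serially; run them in parallel.
The parallel overhead and load seem acceptable: in the runs below, wall-clock time drops from 1:49.51 to 0:39.58 while CPU time stays about the same.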

[root@zoltan-b] ~/ash# time parallel-freenas-debug -A > out
44.314u 17.322s 0:39.58 155.7%  46+170k 877+3088io 18pf+0w
[root@zoltan-b] ~/ash# time freenas-debug -A > out2
43.707u 16.826s 1:49.51 55.2%   41+172k 877+5107io 18pf+0w

[root@zoltan-b] ~/ash# diff -ruN /usr/local/bin/freenas-debug /usr/local/bin/parallel-freenas-debug
--- /usr/local/bin/freenas-debug        2017-05-12 11:34:05.979570000 -0700
+++ /usr/local/bin/parallel-freenas-debug       2017-05-17 06:55:18.075400000 -0700
@@ -260,6 +260,7 @@
        freenas_header 2>&1|tee -a "${FREENAS_DEBUG_DIRECTORY}/osinfo.txt" 

        if $all_debug; then
+               func_pids="" 
                for opt in ${opts_spaced} ; do

                        var=\$$(echo "module_func_${opt}")
@@ -277,10 +278,15 @@
                        then
                                fp="${FREENAS_DEBUG_DIRECTORY}/${directory}" 
                                mkdir -p "${fp}" 
-
-                               eval "${func}" 2>&1|tee -a "${fp}/dump.txt" 
+                               #dispatch the subfunctions in parallel,record  the children 
+                               ( eval "${func}" 2>&1|tee -a "${fp}/dump.txt" ) &
+                               func_pids="$func_pids $!" 
+
                        fi
                done
+               logger "waiting on $func_pids" 
+               wait $func_pids
+               logger "debug complete" 
        else
                while getopts "${aopts}" opt
                do
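
The dispatch-and-wait pattern in the patch can be sketched on its own as follows. This is a minimal illustration, not the committed code: `module_one` and `module_two` are hypothetical stand-ins for the real debug functions, the output directory comes from `mktemp -d` rather than `FREENAS_DEBUG_DIRECTORY`, and the `tee` pipeline is replaced by a plain redirect so that each child's exit status is visible. It also waits on each PID separately, since `wait` with several PIDs reports only the status of the last one.

```shell
#!/bin/sh
# Sketch of the dispatch-and-wait pattern. module_one and module_two are
# hypothetical stand-ins for the real debug functions.
module_one() { sleep 1; echo "one done"; }
module_two() { sleep 1; echo "two done"; false; }  # deliberately fails

outdir=$(mktemp -d)
func_pids=""
for func in module_one module_two; do
    fp="${outdir}/${func}"
    mkdir -p "${fp}"
    # Each child runs in its own subshell and writes only to its own dump.txt.
    ( "${func}" > "${fp}/dump.txt" 2>&1 ) &
    func_pids="${func_pids} $!"
done

# Wait on each child separately so that every exit status is checked.
failed=0
for pid in ${func_pids}; do
    wait "${pid}" || failed=$((failed + 1))
done
echo "modules failed: ${failed}"
```

Because every child has a private directory and file, the outputs cannot interleave; only resources shared between modules would need further coordination.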

parallel-fnd.diff (803 Bytes), Ash Gokhale, 05/17/2017 07:07 AM

Related issues

Related to FreeNAS - Umbrella #24963: Support Team debug dump code improvements and requirements (Closed)

Associated revisions

Revision 203a4ded (diff)
Added by William Grzybowski almost 4 years ago

feat(debug): make debug scripts run in parallel Submitted by: Ash Gokhale Ticket: #24015

Revision b30a9b68 (diff)
Added by William Grzybowski almost 4 years ago

feat(debug): make debug scripts run in parallel Submitted by: Ash Gokhale Ticket: #24015 (cherry picked from commit 203a4ded5ed94ed7a7788abefed6aeca8f9bf7fe)

History

#1 Updated by William Grzybowski almost 4 years ago

  • Status changed from Unscreened to Needs Developer Review
  • Assignee changed from William Grzybowski to John Hixson

#2 Updated by John Hixson almost 4 years ago

  • Assignee changed from John Hixson to Ash Gokhale

I'm not convinced this is a good idea ;-) There are numerous commands in the various submodules that do the same thing and possibly write to the same files. While this could work, I think it needs more fine tuning and proper synchronization.

#3 Updated by Ash Gokhale almost 4 years ago

  • Assignee changed from Ash Gokhale to John Hixson

Please explain what broke. It produces correct output on my test system.

Each func writes its own file and directory via pipe isolation.

#4 Updated by John Hixson almost 4 years ago

  • Assignee changed from John Hixson to Ash Gokhale

Ash Gokhale wrote:

Please explain what broke. It produces correct output on my test system.

Each func writes its own file and directory via pipe isolation.

Just look at the activedirectory, ldap and smb modules. There are multiple queries that are identical. Lots of the same samba commands run. sqlite already has concurrency issues, to the point that we've implemented hacks in various other shell scripts. I'm sure there are more places throughout this program with similar issues. Please use proper concurrency methods to ensure deadlock does not occur.
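
If some module commands did turn out to need serialization, one simple way to sketch it in shell is a lock built on `mkdir`, which either creates the directory or fails, atomically. Everything below is an assumption for illustration: `with_lock` and `append_pair` are hypothetical helpers, not functions from freenas-debug, and this is not what the committed patch does. On FreeBSD, `lockf(1)` would be another option.

```shell
#!/bin/sh
# Sketch of serializing a critical section across parallel children.
# with_lock and append_pair are hypothetical helpers for illustration.
lockdir="$(mktemp -d)/lock"
shared=$(mktemp)

with_lock() {
    # mkdir is atomic: exactly one caller succeeds, the rest retry.
    until mkdir "${lockdir}" 2>/dev/null; do
        sleep 1
    done
    "$@"
    status=$?
    rmdir "${lockdir}"
    return "${status}"
}

append_pair() {
    # Without the lock, the begin/end lines of different workers could interleave.
    echo "begin $1" >> "${shared}"
    sleep 1
    echo "end $1" >> "${shared}"
}

pids=""
for n in 1 2 3; do
    ( with_lock append_pair "${n}" ) &
    pids="${pids} $!"
done
wait ${pids}
cat "${shared}"
```

Only the commands that touch a shared resource would need the wrapper; modules that write solely to their own dump file can still run unserialized.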

#5 Updated by Vaibhav Chauhan almost 4 years ago

  • Target version changed from 11.0-U1 to 11.0-U2

#6 Updated by Bartosz Prokop almost 4 years ago

  • Related to Umbrella #24963: Support Team debug dump code improvements and requirements added

#7 Updated by Bartosz Prokop almost 4 years ago

John,

Can you elaborate some more on these sqlite concurrency issues?

In the debug modules you mention, I see only read operations against the sqlite database.

Furthermore, the sqlite documentation indicates that sqlite itself is safe under concurrent access:

High Concurrency

SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.

(5) Can multiple applications or multiple instances of the same application access a single database file at the same time?

    Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.

I've tested parallel-freenas-debug on an HA setup with SMB shares and an AD connection.
After running four instances of parallel-freenas-debug at the same time, I didn't see any issues (apart, obviously, from each instance overwriting some of the previous one's results).

The shell utilities used within those scripts are well tested and safe.

I think deadlocks and the like are a concern for a lower level of abstraction.

Both Ash and I would appreciate more details and examples from the code.

Thanks,

Bartosz

#8 Updated by Bartosz Prokop almost 4 years ago

  • Assignee changed from Ash Gokhale to John Hixson

#9 Updated by John Hixson almost 4 years ago

  • Assignee changed from John Hixson to Bartosz Prokop

It doesn't matter what I have to say since this is already in freenas. Why is this ticket still open?

#10 Updated by Ash Gokhale almost 4 years ago

Hmm. Code escape.
https://github.com/freenas/freenas/commit/203a4ded5ed94ed7a7788abefed6aeca8f9bf7fe
It was not my intention to escape the review process. If it's incorrect, we should understand why and fix it.

#11 Updated by Dru Lavigne almost 4 years ago

  • Assignee changed from Bartosz Prokop to John Hixson

As per discussion on Telegram, Ash wrote the code, William committed it to master, and John needs to review it as he is familiar with the original code before it can be merged to stable.

#12 Updated by John Hixson almost 4 years ago

  • Status changed from Needs Developer Review to Reviewed by Developer
  • Assignee changed from John Hixson to Release Council

Dru Lavigne wrote:

As per discussion on Telegram, Ash wrote the code, William committed it to master, and John needs to review it as he is familiar with the original code before it can be merged to stable.

I've already stated how I feel about this ;-) The code is already in FreeNAS, no point in removing it.

#13 Updated by Dru Lavigne almost 4 years ago

  • Assignee changed from Release Council to Ash Gokhale

#14 Updated by Vaibhav Chauhan almost 4 years ago

  • Target version changed from 11.0-U2 to 11.0-U3

#15 Updated by Vaibhav Chauhan over 3 years ago

Please publish your changes in a PR to be merged against the stable branch, and let us know here when the PR is ready.

#16 Updated by Vaibhav Chauhan over 3 years ago

  • Status changed from Reviewed by Developer to 47

#17 Updated by Dru Lavigne over 3 years ago

  • Subject changed from parallelize freenas-debug [patch] to parallelize freenas-debug scripts

#18 Updated by Joe Maloney over 3 years ago

  • File debug-bonniemini-20170828120324.tgz added
  • Status changed from 47 to Ready For Release
  • Needs QA changed from Yes to No
  • QA Status Test Passes added
  • QA Status deleted (Not Tested)

Saving debug seems to work as expected.

#19 Updated by Dru Lavigne over 3 years ago

  • Subject changed from parallelize freenas-debug scripts to Parallelize freenas-debug scripts

#20 Updated by Vaibhav Chauhan over 3 years ago

  • Status changed from Ready For Release to Resolved

#21 Updated by Dru Lavigne over 3 years ago

  • File deleted (debug-bonniemini-20170828120324.tgz)
