The GêBR Project kindly thanks the support of:
GêBR is a simple graphical interface which facilitates geophysical data processing. GêBR is not a package for processing. Instead it is designed to integrate a large variety of free processing packages, such as Seismic Un*x and Madagascar.
GêBR manages seismic-data processing projects, dealing with multiple 2D lines, each one with its own set of processing flows. Through GêBR, it is possible to assemble and execute such processing flows and track the outputs of the data processing. Everything in a simple and friendly way.
Being free software, GêBR can be used and customized by anyone at no cost, according to the terms of the GNU General Public License. That makes this software very attractive for teaching and academic research.
The GêBR Project was initially proposed to develop and promote the GêBR interface. During 2007–2008, the core team of developers offered several courses on seismic processing with GêBR at the most active geophysics research centers in Brazil. Around 200 students and professionals had direct contact with the project in that period. In the process, the GêBR team observed a demand for integration efforts within the Brazilian geophysical community. That motivated a redefinition of the project’s goals.
The main objective of GêBR is to stimulate the integration of the Brazilian Geophysical Community by providing an interface to geophysical data processing that could be used not only for teaching purposes but also as a research dissemination vehicle.
For more information, visit the project's official site: http://www.gebrproject.com.
This release of GêBR brings a lot of new exciting features to ease the handling and the management of flows and programs, including:
This user guide is for GêBR version 0.20.0. The images in this guide were captured on an Ubuntu 10.04-based system, so slight differences may be observed on other operating systems. For installation instructions specific to each operating system, see the install guide on the project's official site.
Director:
Ricardo Biloti
<biloti@gebrproject.com>
Developer manager:
Fabrício Matheus Gonçalves
<fmatheus@gebrproject.com>
Developers:
<keiji.eric@gebrproject.com>
<ian.liu@gebrproject.com>
<igor.snunes@gebrproject.com>
<jorge.pzt@gebrproject.com>
<juliakm@gebrproject.com>
<querencia@gebrproject.com>
Consultants:
Former members:
Thank you for using GêBR!
GêBR is primarily conceived to process 2D seismic data lines. Data from related 2D acquisition lines are usually grouped in one seismic-data processing project. The processing of a specific data set is carried out by a number of processing flows. Each processing flow is a chain of programs that operate sequentially on input data to produce output data. For example, processing flows can be assembled to accomplish tasks like Spike Deconvolution, Noise Attenuation, NMO correction, and so on.
In GêBR, a seismic-data processing project is referred to simply as a project. Each project can hold many seismic data sets, each one referred to as a line. Each line has its own set of processing flows, or just flows for short.
In summary, GêBR has three levels of organization, from top to bottom:
Project: It is a set of lines. A project holds only a few basic pieces of information and is used to group related lines.
Line: It is a set of flows. During the setup of the line, some important choices have to be made, mainly concerning the organization of the processing products (intermediate data, figures, tables, etc) produced during the processing of the seismic data.
Flow: It is a sequence of programs, designed to accomplish a specific task, in the course of seismic data processing.
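The three levels above can be pictured as nested containers. The following Python sketch is only an illustration of that hierarchy; the class names, fields and sample titles are assumptions made for this example and do not reflect how GêBR stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    """A sequence of programs designed to accomplish a specific task."""
    title: str
    programs: list[str] = field(default_factory=list)


@dataclass
class Line:
    """A set of flows."""
    title: str
    flows: list[Flow] = field(default_factory=list)


@dataclass
class Project:
    """A set of related lines."""
    title: str
    lines: list[Line] = field(default_factory=list)


# Hypothetical titles: a flow lives inside a line, a line inside a project.
decon = Flow("Spike Deconvolution", programs=["program A", "program B"])
line = Line("Line 01", flows=[decon])
project = Project("My survey", lines=[line])

for ln in project.lines:
    for fl in ln.flows:
        print(f"{project.title} / {ln.title} / {fl.title}: {len(fl.programs)} programs")
```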
Thus, before creating and executing flows, it's necessary to create at least one project owning one line. Section 3.1, “Creating projects and lines”, and Section 4.2, “Creating flows” explain how this is accomplished.
To assemble a flow, the user has not only to select programs, but also configure them properly, through their parameters. Once the flow is configured, it is ready to actually be executed.
A flow is a sequence of programs, as just explained. In GêBR, the user might think that there is a list of available programs to assemble flows. This is partially true only. Indeed, flows are built from menus. But what is a menu?
A menu is a representation of a single program or of a set of programs. This means that when a menu is added to a flow, one or more programs are inserted into the flow at once. Why is that so? Think about common tasks that are accomplished by a standard sequence of programs. Instead of building a flow by adding programs one by one, the flow could be built from a menu that packs the whole set of programs. For example, consider the task of adding headers to raw data to obtain seismic data in Seismic Un*x format. Figure 2.1, “Add header flows” shows two possibilities to assemble the same flow, depending on how the programs SU Add Header and SU Set Header are represented, either as two independent menus or as one menu encapsulating both programs.
Even when a menu represents more than one program, at the moment the menu is added to the flow, all the programs that comprise it are added independently to the flow. This means that the resulting flow will be the same, no matter how the programs have been added, at once or one by one.
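As a rough illustration of this equivalence, the sketch below models a flow as a plain list of program titles and a menu as a list of the programs it packs. The catalogue and the title of the combined menu are hypothetical; only the two program names come from the text.

```python
# Hypothetical menu catalogue: each menu title maps to the programs it packs.
MENUS = {
    "SU Add Header": ["SU Add Header"],
    "SU Set Header": ["SU Set Header"],
    "Add and set header": ["SU Add Header", "SU Set Header"],  # assumed title
}


def add_menu(flow, menu_title):
    """Append every program of the menu to the flow as an independent entry."""
    flow.extend(MENUS[menu_title])
    return flow


one_by_one = add_menu(add_menu([], "SU Add Header"), "SU Set Header")
at_once = add_menu([], "Add and set header")
print(one_by_one == at_once)  # True: both flows hold the same two programs
```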
After assembling the flow, to complete its set up, the user has to inspect each program of the flow and define its parameters. Programs may depend on many parameters, from a variety of types. GêBR supports parameters of the following types:
GêBR supports arithmetic expressions to define numeric parameters, and text concatenation for text parameters. Besides, it is possible to define quantities and operate with them to set up parameters of the programs.
The GêBR set of tools is composed of three main player categories: GêBR (the interface itself), maestros, and nodes. But who are those guys?
The GêBR interface, sometimes referred to simply as GêBR, is the graphical interface with which the user interacts to build flows, execute them, inspect results, etc. Usually, this interface is installed and running on the machine the user has physical contact with.
The user may have access to many machines, which is a usual scenario for users of a computational laboratory. Through GêBR, it is possible to take advantage of this set of machines to run flows. Those machines, or processing nodes, are referred to in GêBR simply as nodes. They may be local, meaning the user is physically using them, or remote, meaning the user has access to them through the network. No matter where the machines are, they are treated equally by GêBR, which means that the user does not have to care about that.
However, GêBR does not talk directly to the nodes. This communication is established through another player, the maestro. When the user decides to execute a flow, GêBR sends the request to the maestro, giving rise to a job. The maestro acts as a coordinator of the nodes, collecting information and ranking them according to their capabilities and available resources. Therefore, the maestro can make informed decisions about which machines are best suited to run a job.
The nodes put their computational power at the disposal of the maestro. Under the maestro's coordination, the nodes can even cooperate to conclude processing jobs faster.
The maestro receives the job submission and dispatches the job to some of the nodes under its control. All the information generated by the job is collected and sent back to GêBR, where it is presented to the user.
This communication process, although complex, is completely transparent to the user, who can concentrate on processing seismic data, leaving all technical details to GêBR.
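To make the maestro's role more concrete, here is a minimal Python sketch of ranking nodes and choosing where a job runs. The scoring rule (free cores), the field names and the host names are assumptions made for illustration; the guide does not describe the criteria the maestro actually uses.

```python
from dataclasses import dataclass


@dataclass
class Node:
    hostname: str
    cores: int    # assumed measure of capability
    load: float   # assumed measure of resources in use (fraction of cores)


def rank_nodes(nodes):
    """Order nodes from most to least free capacity (illustrative score)."""
    return sorted(nodes, key=lambda n: n.cores * (1.0 - n.load), reverse=True)


def dispatch(job, nodes, how_many=1):
    """Pick the best-ranked nodes for the job and report the choice."""
    chosen = rank_nodes(nodes)[:how_many]
    return {"job": job, "nodes": [n.hostname for n in chosen]}


nodes = [Node("node-a", 4, 0.50), Node("node-b", 8, 0.25), Node("node-c", 2, 0.50)]
print(dispatch("my-flow", nodes, how_many=2))
```

With these sample numbers, node-b has 6 free cores, node-a has 2 and node-c has 1, so the job goes to node-b and node-a.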
The GêBR interface is intentionally simple. It is organized in three tabs:
There are also additional resources in menu bar, on the top of the window:
Projects and lines are the organizer entities and are managed in the Projects and Lines tab. Seismic processing lines can be created and manipulated here. Lines can be grouped in projects, for sake of organization. The information presented in the Flows tab depends on the line selected here.
Try using a context menu instead of the buttons on the toolbar. To do so, right-click on one of the projects or lines that have already been created. Notice that many commands can be found in this context menu. In certain cases this method is even easier and faster than using the toolbar buttons.
The button creates a project. The project's title and description, as well as the user's e-mail address, are requested at creation. This basic information can be edited later through the edit button.
The project will appear on the left side of the window (see Figure 3.1, “Projects and Lines tab”). Information about the project is shown on the right hand side of GêBR's main window. Note that some of this information comes from what was previously specified. Details such as creation date and modified date are automatically generated by GêBR.
The button creates a line. The information below will be requested to complete the operation:
Basic properties of the line, like title, description, author, and e-mail address (they can be further edited through the button );
The BASE path is a directory where most of the files referred in the line's flows should be placed (see Section 3.3, “Important folders of a line”);
The IMPORT path is a directory used to easily find files outside the BASE directory.
This information is aided by an assistant dialog. When the assistant is completed, the line will be created and shown as part of the selected project. Like the project's settings, the line's settings are exhibited on the right side of GêBR's window.
To create lines, GêBR must be connected to a maestro (see Section 7.3, “Maestro/Nodes configuration”). Otherwise, GêBR will not be able to submit flows for running, since this is done through the maestro.
To delete a line, select it and then left-click on the delete button.
When a line is deleted, all its flows are also deleted. This operation cannot be undone.
To delete a project, it must not contain lines, so it is necessary to delete them first and then proceed to the project. This is a protection against accidental clicks on the delete button.
GêBR defines a directory structure to aid in the task of managing the data associated with a set of flows. This structure is defined by a top directory, known as BASE, and a few other nested standard directories. They are:
Other important folders are HOME and IMPORT. HOME is automatically set to the path of the user area. IMPORT, unlike the previously mentioned folders, is not tied to the BASE path and can be set independently.
For instance, suppose we are going to set the important folders of a line. The IMPORT folder is totally independent and can be set anywhere, e.g., /tmp/data. If the home folder is /home/john/, the BASE path must be set inside /home/john/. Taking this into consideration, and setting the BASE path to /home/john/GeBR/MyFirstLine/, this is going to be the directory structure associated with the current line:
This standard directory structure eases the task of managing the data related to lines. In the input/output/log fields, these folders can be referenced by enclosing their names in angle brackets (for instance, <BASE>). For more information, consult Section 4.5, “Editing the flow's input and output files”.
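The angle-bracket notation can be pictured as a simple text substitution. The Python sketch below uses the folder values from the example above; the helper names and the file names (shots.su, raw.bin) are hypothetical, and the actual expansion performed by GêBR may differ in detail.

```python
import re
from pathlib import PurePosixPath

# Folder values taken from the example above.
FOLDERS = {
    "HOME": "/home/john",
    "BASE": "/home/john/GeBR/MyFirstLine",
    "IMPORT": "/tmp/data",
}


def base_is_inside_home(folders):
    """Check the rule that BASE must be located inside the HOME folder."""
    home = PurePosixPath(folders["HOME"])
    return home in PurePosixPath(folders["BASE"]).parents


def expand(path, folders):
    """Replace each <NAME> placeholder by the corresponding folder path."""
    return re.sub(r"<([A-Z]+)>", lambda m: folders[m.group(1)], path)


print(base_is_inside_home(FOLDERS))          # True
print(expand("<BASE>/shots.su", FOLDERS))    # /home/john/GeBR/MyFirstLine/shots.su
print(expand("<IMPORT>/raw.bin", FOLDERS))   # /tmp/data/raw.bin
```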
Remember that the processing takes place at the processing node. The directory structure described above will be created there. This means that these directories will not be present on the local machine if the node is a remote machine.
To be able to browse files in these directories, the remote browse feature must be enabled in the maestro of the line (see Section 8.2, “Accessing remote files”).
Although it is not mandatory to adhere to this structure, it is highly recommended to do so.
GêBR supports only one connected maestro at a time. However, there may be several maestros available, one for University A and another for University B, for example.
Whenever a maestro is connected, all lines associated with other maestros will be unavailable for editing. However, a line can be migrated from its original maestro to the connected one. To do so, the line must be selected. In the right-hand side panel, an icon in the upper left corner indicates whether the maestro defined for the line is connected or not. In case the original maestro is not the connected one, clicking on that icon makes GêBR offer two options:
disconnect from the current maestro and connect to the maestro associated to the line; or
dissociate the line from its original maestro and associate it to the connected maestro.
Whenever a line is migrated from one maestro to another, all data in the BASE directory and in its children will remain in the nodes of the original maestro.
Although GêBR continuously saves all data, sometimes it is desirable to have copies of projects and/or lines in a file (for example, to share them with others or to make backups). To export a project or a line:
Select the line or project that will be saved;
Left-click on ;
A dialog will be shown. Choose a directory and a name for the file;
Left-click on the button to conclude the export.
GêBR determines the extension automatically, i.e., .prjx for projects and .lnex for lines.
To import projects or lines that were previously exported follow the steps:
Click on the button ;
Select the desired project or line to import in the dialog that appears. Only files with extension prjz, prjx, lnez or lnex will be shown;
Click on the button.
An imported project is added to the list of projects, while an imported line joins the other lines of the project that was selected prior to this process. In both cases the imported item is identified with the suffix Imported.
In GêBR, lines may only exist inside a project (see Section 2.1, “Projects, lines and flows”). Therefore to import a line first select an existing project, or create a new one.
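The extension rules for exporting and importing can be summarized in a few lines of Python. This is only a sketch of the naming convention described above; the function names and the sample file names are invented for the example.

```python
from pathlib import Path

EXPORT_EXTENSION = {"project": ".prjx", "line": ".lnex"}
IMPORTABLE = {".prjz", ".prjx", ".lnez", ".lnex"}


def export_name(name, kind):
    """Append the extension chosen automatically for a project or a line."""
    return name + EXPORT_EXTENSION[kind]


def importable(filenames):
    """Keep only the files that the import dialog would list."""
    return [f for f in filenames if Path(f).suffix in IMPORTABLE]


print(export_name("backup", "project"))                      # backup.prjx
print(importable(["a.prjx", "b.lnez", "notes.txt", "c.flw"]))  # ['a.prjx', 'b.lnez']
```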
Projects and lines can be commented on, and a report can be generated automatically for them. To edit their comments, follow these steps:
Choose a project or line;
Click on the button on the toolbar;
Choose Edit Comments;
Fill in some comment in the text editor that is presented;
When finished, save and close the editor.
You can print the report by clicking → .
The report can be visualized by clicking over → .
After clicking on View Report, a window will appear showing the generated report for that project or line. This window has some functionalities that are mentioned below:
Include comments for this line into the report;
Include the reports of all flows of this line;
This option will include reports from all snapshots of all the flows that compose this line (see Section 5.3, “Snapshot of a flow”);
Allows choosing how much information about the parameters will be shown.
Change the presentation style of the report.
The option to include reports of flows' snapshots is available in both the line's and the flow's View Report. More information about snapshots can be found in Section 5.3, “Snapshot of a flow”.
In the Flows tab, processing flows can be created, edited and submitted for execution. After creating a flow, the desired programs can be added to it and the I/O files and parameters can be configured. Once properly configured (GêBR warns if it is not), the flow can be submitted for execution. Prior to the submission, execution details can be set.
The Flows tab is used to manage the flows of a particular line. Notice the similarity of this tab with the Projects and Lines tab.
The left panel contains the list of flows that compose the line selected in the previous tab, indicating whether each flow has snapshots, while the right panel exhibits information about the currently selected flow, divided in two parts:
The right panel can present the snapshots view if the user clicks on the snapshots icon. In that case, the summary of the flow is replaced by the snapshots view (check Section 5.3, “Snapshot of a flow”).
In Section 2.2, “Menus, programs and their parameters” a menu was defined as a set of one or more programs that can process data. These menus are accessible in the Flows tab, through the menu list button in the toolbar.
After clicking the menu list button, a pop-up window is shown with a list of categories. All menus registered inside GêBR are divided into these categories, which can be expanded by clicking the icon on the left.
A menu can appear more than once in the unfiltered list, since it can belong to any number of categories.
By typing parts of the menu title or description in the text box on the top, it is possible to filter the menus list. This eases the task of finding or discovering a menu when its domain is known. For example, it is possible to search for menus related to migration (see Figure 4.2, “Filtering the menus list”).
In the Flows tab, the button creates a flow. The flow's title, description and author should be provided at creation, and can be edited later through the edit button.
After following these steps, the flow will appear on the left side of the main window. The right side will show information about this flow, like its title, description and modified date.
It's possible to alter the position of the flow in the list by dragging the flow with the mouse to the desired position.
A processing flow is a sequence of operations defined by the user. These operations, also called programs, are organized into the following categories according to their purpose:
Data Compression
Editing, Sorting and Manipulation
File tools
Filtering, Transforms and Attributes
Gain, NMO, Stack and Standard Processes
Graphics
Import/Export
Migration and Dip Moveout
Multiple Suppression
Seismic Unix
Simulation and Model Building
Utilities
After the flow is created, a pop-up appears with the available menus, and the flow can be assembled:
Choose one of them and include it in the flow with a double-click.
To specify an order to the programs, simply drag and drop the desired programs.
Inserted programs come in an enabled state (unless they have required parameters).
Some programs have required parameters. To edit or to alter the default parameters, consult Section 4.4, “Editing program's parameters”.
To change the program status, right-click over it and choose the first option, Enable/Disable (for more information, see Section 4.3, “Program states”).
To run a flow, all the programs listed in the Flow Sequence box must be enabled. Otherwise the flow will not be executed as expected.
The flow is ready to be executed. Click on or on to do it (for more information, consult Section 4.7, “Executing flows”).
After the flow has been assembled, it will be visible on the left side of the main window when the Flows tab is selected. The Details box, found on the upper right side of the main window, shows information about the selected flow. The Flow Review box shows a summary of the flow, with information like the input, output and log files, the flow's programs and some of their parameters.
Programs can be in two states only, Enabled or Disabled. If a program is enabled with an error, its icon changes to flag the problem.
There are two methods to alternate between these states: right-click on the program and then select the desired state from the context menu, or use the shortcut Spacebar to change the states of the selected programs. Several programs can be selected at once by holding Ctrl or Shift while clicking.
Changing a program's state does not alter its parameter configuration, so alternating between states is a completely safe operation.
Disabled programs will be ignored when the flow runs. This way, the user can enable and disable parts of the flow.
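The behaviour of program states can be sketched as follows. The data model and the parameter names are assumptions made for this illustration; the program titles are borrowed from examples used elsewhere in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    title: str
    parameters: dict = field(default_factory=dict)
    enabled: bool = True


def toggle(program):
    """Flip the state of a program; its parameters are left untouched."""
    program.enabled = not program.enabled


def programs_to_run(flow):
    """Disabled programs are ignored when the flow runs."""
    return [p.title for p in flow if p.enabled]


flow = [
    Program("SU Plane", {"planes": 3}),        # hypothetical parameter name
    Program("X Image"),
    Program("PS Image", enabled=False),
]
print(programs_to_run(flow))   # ['SU Plane', 'X Image']

# Switch the output from X Image to PS Image without touching any parameter.
toggle(flow[1])
toggle(flow[2])
print(programs_to_run(flow))   # ['SU Plane', 'PS Image']
```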
A program's parameters compose a set of initial configurations defined by the user.
To edit a program's parameters, follow the steps below:
Select the program from the list and click on the button on the toolbar, or just double-click on the program. The Parameter box will appear on the right side of the window, over the Flow Review box.
Edit the program's parameters. Notice that parameters vary greatly both in size and type. In parameter fields that can be filled with numbers or text, variables can be used (Section 5.2, “Using variables”).
Click on the button (bottom right corner of the dialog box) to view the program's documentation. This will certainly be useful when the user is editing the program's parameters.
Click on the button to return to default configurations.
All changes are saved automatically. To close the Parameter box, simply change the selection on the left side.
On many occasions, it is necessary to read data from an input file and/or to write the results to an output file, or even to a log file in case an error occurs.
To associate input, output and log files with a flow, follow these steps:
Select a flow, in the Flows tab.
Below the selected flow, the programs and the entries for the input, output and log files will be shown. To edit a file path, just double-click on it.
Type in the path (important folders can be consulted/chosen by typing <) or use the browse button to look for the file (see Section 8.2, “Accessing remote files”).
To use the folders associated with the line, choose them as described above or use the autocomplete feature for the line's important folders. For more information, see Section 3.3, “Important folders of a line”.
When a path is chosen for the flow's input/output, the file paths will appear in entries, each indicated by its own icon, below and above the flow's programs (if there are any).
To remove entries from the list, select them, right-click and then choose the removal option.
The clipboard provides the popular set of tools known as copy and paste. A flow (or set of flows) can be copied to the clipboard and pasted back to the same line or to other lines. In the same manner, a program (or set of programs) can be copied to the clipboard and pasted to the same flow or to other flows.
To copy flows or programs to the clipboard, first select them with the mouse, and then use the button or press the usual shortcut Ctrl+C. After copying the user can paste it by clicking on the button or by simply using the shortcut Ctrl+V.
To delete a flow (or set of flows) or a program (or set of programs), select them and then click on the delete button.
It's possible to handle several flows at once by holding Ctrl or Shift while clicking.
GêBR does not ask for confirmation before deleting programs from a flow.
Once assembled, a flow can be executed in two ways:
More than one flow can be selected together and executed. To make a multiple selection, click over the flows, while Ctrl is pressed or use the arrow keys while Shift is pressed.
After being triggered, the execution can be followed in the same window or, for more detailed information, in the Jobs tab (see Section 4.8, “Following the execution of a flow” and Chapter 6, The Jobs tab).
By clicking on , the job will be triggered with the following settings:
It is a fast way to execute, skipping the configurations of the Setup and Run. Useful key bindings are available here: Ctrl+R runs the selected flows one after another; Ctrl+Shift+R runs the selected flows in parallel (this key binding is active only if multiple flows are selected).
Some execution settings (like the group of nodes, the number of tasks into which the job is split, and the priority) can be saved using Setup and Run. Those settings then become the new defaults for the basic Run. (For more information, see Section 4.7.2, “Setup and run”.)
Many execution details can be set before triggering a job. After setting, click on Run to execute the job.
By clicking over , it can be set:
Items 3, 4 and 5 can be saved for future executions by checking Save Preferences.
Example 4.1. Parallel execution
On a multi-core system, create a flow according to the following steps:
It can be seen that:
This flow is parallelizable only because each step of the loop is totally independent of the others. So, in a system with multiple machines and multiple cores, this flow can be divided so that all its tasks execute at the same time, making the execution faster. Any flow with the same characteristic can take advantage of this.
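The idea of dividing independent loop steps into tasks can be sketched in Python as below. The round-robin split, the use of threads and the stand-in computation are assumptions made to keep the example small; GêBR distributes tasks among nodes and cores through the maestro, and the way it splits a job is not described here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_steps(steps, n_tasks):
    """Distribute the loop steps among n_tasks groups, round-robin."""
    return [steps[i::n_tasks] for i in range(n_tasks)]


def run_step(step):
    """Stand-in for one iteration of the flow; it depends only on its own step."""
    return step * step


def run_task(task_steps):
    """Run, in order, all the steps assigned to one task."""
    return [run_step(s) for s in task_steps]


steps = list(range(1, 9))
tasks = split_steps(steps, n_tasks=3)
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_task, tasks))

print(tasks)     # [[1, 4, 7], [2, 5, 8], [3, 6]]
print(results)   # [[1, 16, 49], [4, 25, 64], [9, 36]]
```

Because no step reads the output of another, the order in which the tasks finish does not affect the results.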
After a flow has been sent for execution, the status and output of the resulting job can be followed in the same window, still in the Flows tab. A bar at the bottom of the window will appear to keep the history of the last jobs dispatched for execution. Through buttons in that bar it is possible to switch among the results of the many jobs.
Each job execution is represented in the bar as a button, composed by:
Unlike the Jobs tab, the Flows tab presents just a quick view of the job. Indeed, the aim of this feature is not to present a full view of the job, but to avoid switching tabs to follow jobs. The arrow leads to the Jobs tab, where a complete view of the flow execution can be consulted.
When multiple flows are dispatched for execution, instead of staying in the same window, GêBR automatically switches to the Jobs tab, where all the executions can be seen all together.
Although GêBR maintains all the data, sometimes it is necessary to copy a flow to a file for sharing. To do so:
In case of Import, the user must first select a line (or see Section 3.1, “Creating projects and lines”, if one does not exist).
Select the tab Flows and left-click on the corresponding button to import or to export.
Navigate through the files and select the flow that is to be imported or exported.
In case of Export, in the window Save flow, navigate to the desired directory and type a name for the file (GêBR will automatically determine the extension flw).
In case of Import, the imported flow will be listed along with any other flows of the selected line. In case of Export, the file containing the exported flow will be created.
In the course of processing data, it is common for many flows to depend on some fundamental quantities, like acquisition parameters, for example. Those quantities may be provided explicitly to each flow. Consider, however, a scenario where one or some of those quantities have to be redefined. The user would have to go through all the flows that depend on them, making the necessary updates. Although possible, this procedure is error-prone and time-consuming. A better approach is to centralize the definition of those common quantities, in such a way that a change to them is automatically propagated to all flows where they are employed. This central place for the definition of these "quantities" is the Dictionary of variables.
The Dictionary of variables is an interface to handle all the variables. It is accessed through the icon , in the More menu of toolbars of the Projects and Lines and Flows tabs.
GêBR has three levels of organization (see Section 2.1, “Projects, lines and flows”). The variables, thus, have three levels of visibility:
The dictionary validates the variables dynamically, revalidating them as soon as anything is changed in dictionary. Programs with errors are automatically revalidated too (see Section 4.3, “Program states”).
Positioning the pointer on the icon of the variable type, the user can check the solved expression. If the Dictionary of variables finds an error among the variables, it exhibits an error icon and an explanation of the error. The variables of the dictionary can be reordered by drag and drop. Since a variable can only use variables defined above it, this feature makes the declaration of variables more flexible and dynamic.
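The rule that a variable can only use variables defined above it can be illustrated with a small Python sketch. The variable names and values are hypothetical, and Python's eval is used only to keep the example short; it says nothing about how GêBR evaluates expressions internally.

```python
def solve(dictionary):
    """Evaluate variables from top to bottom; each may use only those above it."""
    values, errors = {}, {}
    for name, expression in dictionary:
        try:
            # Only the variables already solved are visible to the expression.
            values[name] = eval(expression, {"__builtins__": {}}, dict(values))
        except NameError as err:
            errors[name] = str(err)
    return values, errors


# Hypothetical acquisition quantities.
dictionary = [
    ("dt", "0.004"),
    ("ns", "1001"),
    ("tmax", "dt * (ns - 1)"),
]
solved, problems = solve(dictionary)
print(solved["tmax"], problems)

# Moving tmax above the variables it depends on makes it invalid.
reordered = [dictionary[2], dictionary[0], dictionary[1]]
solved_again, problems_again = solve(reordered)
print(sorted(problems_again))   # ['tmax']
```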
Besides variables and expressions, some predefined functions can be used:
Table 5.1. Available functions
Function | Syntax |
---|---|
Square root | sqrt (value) |
Sine | s (value) |
Cosine | c (value) |
Arctangent | a (value) |
Natural logarithm | l (value) |
Exponential | e (value) |
Bessel | j (order, value) |
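For readers unsure what the short function names compute, the sketch below maps them to their counterparts in Python's standard math module. It assumes that angles are given in radians, which this guide does not state, and it leaves out the Bessel function j because the standard library has no equivalent.

```python
import math

# Dictionary function name -> Python counterpart (angles assumed in radians).
FUNCTIONS = {
    "sqrt": math.sqrt,   # square root
    "s": math.sin,       # sine
    "c": math.cos,       # cosine
    "a": math.atan,      # arctangent
    "l": math.log,       # natural logarithm
    "e": math.exp,       # exponential
}

print(FUNCTIONS["sqrt"](16.0))              # 4.0
print(FUNCTIONS["l"](FUNCTIONS["e"](2.0)))  # close to 2.0
print(4 * FUNCTIONS["a"](1.0))              # close to pi
```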
The character [ (open bracket) is used to see all the available variables for auto-completion.
Navigation between the fields can be done with the keys Enter or Tab, or with the mouse.
To use a variable in a field, the variable name needs to be enclosed in square brackets, like [ name-of-the-variable ].
Example 5.1. Using the dictionary
The dictionary is a very useful feature for using the same value multiple times. It is also useful for giving names to values, making a flow's parameters more intuitive. For example:
It can be seen that:
GêBR allows the execution of repetitive procedures by creating flows with Loops (see Section 5.4, “Flows with loops”). With Loops, the user has access to a special variable called iter.
Upon execution, the user might want to identify what is the current step of the Loop. The iter variable is devised with this aim.
For instance, the output of each step of the Loop can be defined to a file with a name identified by the step, output-<number-of-steps> (see Figure 5.2, “Usage of iter variable”).
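As a sketch of how a text field containing the iter variable might resolve on each step, the Python snippet below substitutes variable names written in square brackets. The substitution helper and the file name pattern are hypothetical and serve only to illustrate the idea.

```python
import re


def substitute(text, variables):
    """Replace every [name] in the text by the value of that variable."""
    return re.sub(r"\[\s*(\w+)\s*\]", lambda m: str(variables[m.group(1)]), text)


# One output file per step of the Loop (hypothetical file name pattern).
for step in (1, 2, 3):
    print(substitute("output-[iter].su", {"iter": step}))
# output-1.su
# output-2.su
# output-3.su
```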
When working with GêBR, there are situations in which it is useful to take an existing flow as a basis for creating new flows, for example, to experiment with flows that are only slightly different from one another without having to discard or modify the original flow.
One way to do this is to copy and paste the flow (see Section 4.6, “Copying, pasting and deleting flows and programs”). Another way is to take a snapshot of the original flow. Through a snapshot, GêBR stores the settings of a flow, so it is easy to come back to them later.
All snapshots of the flow are presented in the flows tab. To take a snapshot of a flow:
Select the flow and left-click on the More button on the toolbar.
Then choose the option to take a snapshot (or press Ctrl+S).
It is necessary to write a description of the snapshot in the dialog box, to identify this saved state later on.
Left-click on the button to save your snapshot.
To see the saved snapshots, just left-click on the snapshots icon in the details of a flow.
A user who works with different versions of the same flow may also benefit from snapshots. One option would be to save many different flows, one for each version, with the drawback of cluttering the Flows tab. Another option, using the feature above, is to take a snapshot of each version of the flow.
The graph indicates the version from which each snapshot was derived. The start of an arrow indicates the origin and its end indicates the modified version.
GêBR always preserves the setup of the flow. Even if the computer is turned off, this list will be recovered when GêBR is opened again. However, if the flow is deleted, all its saved states are immediately and permanently lost.
Through a right-click over the target snapshot, the user has the option to Revert, Delete or Run it. The right-click over the current stage (Now) allows the user to Run the current flow or to take a snapshot thereof.
In this view, multiple selection is also possible: just click over several snapshots. The actions are available through a right-click over a snapshot. After choosing several snapshots and an action, the action will be applied to all selected snapshots.
Deselecting a snapshot is as easy as selecting one: the user just needs to click over the undesired snapshot. It is also possible to deselect all the selected snapshots by clicking on the white area in this view.
If a snapshot of a flow was taken and for some reason undesired modifications were made to the flow, it is possible to revert to a snapshot by double-clicking on it. Whenever a snapshot is reverted, the dialog Reverting modified flow will appear.
If the current state was previously saved, GêBR recommends that the user click on Backup Current State, in the dialog box, so as to avoid a bloated snapshot view.
Deleting a snapshot is a permanent action. After that, the snapshot disappears and cannot be recovered.
GêBR keeps track of the origin of each snapshot. In the graph of the figure, 'B1' and 'B2' come from 'B' which, in turn, comes from 'A'. If the user deletes 'B', GêBR will consider 'B1' and 'B2' as originating from 'A'.
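This re-linking of origins can be sketched with a dictionary that maps each snapshot to the snapshot it was derived from. The representation is an assumption made for the example, not GêBR's internal format.

```python
def delete_snapshot(parents, name):
    """Remove a snapshot and attach its children to its own origin."""
    origin = parents.pop(name)
    for child, parent in parents.items():
        if parent == name:
            parents[child] = origin
    return parents


# Each snapshot points to the snapshot it was derived from.
parents = {"A": None, "B": "A", "B1": "B", "B2": "B"}
delete_snapshot(parents, "B")
print(parents)   # {'A': None, 'B1': 'A', 'B2': 'A'}
```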
Running a snapshot is like running a flow (see Section 4.7, “Executing flows”). The advantage of doing this is the possibility of executing a snapshot without having to revert to it. Thus, saved flows can be tested without changing the current one. As with flows, Ctrl+R runs the selected snapshots one after another; Ctrl+Shift+R runs the selected snapshots in parallel (this key binding is active only when multiple snapshots are selected); the Delete key deletes the selected snapshots.
Example 5.2. Using snapshots
Take the program SU Plane as an example. This program creates common-offset data with up to three planes, and there are two possible output options for the flow: an image plot, using the program X Image, or a PostScript image, using PS Image.
To solve this issue (two possible outputs for a flow) without the need to create two separate flows with small differences, it is possible to take two snapshots: one for image plot, and the other for postscript.
To do that, follow the steps below:
Now a single flow can produce two different plots of the data generated by SU Plane.
The notion of a loop refers to a series of commands that is repeated over and over, possibly with small changes, until a condition is met. For example, suppose the user needs to generate one plot for each of a set of functions. One option is to create many flows, one per function, each generating a plot. A simpler option is to create a single flow that generates a plot for each function. The Loop program makes this possible in GêBR.
The Loop is a program from the category Loops that has a totally different usage compared to the remaining programs of GêBR. Here these differences are presented.
Whenever the program Loop is added, it appears at the top of the flow. This indicates that the flow is going to be executed more than once, according to the parameters set for the program (see Section 4.4, “Editing program's parameters” for further details).
After the Loop is added, a new variable, iter, becomes available (see Section 5.2, “Using variables”). The value of this special variable changes on each iteration (it is incremented or decremented), according to the parameters set.
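As a rough illustration of how iter evolves, the following Python sketch generates the sequence of values for a loop. The parameter names `initial`, `step` and `total` are assumptions chosen for this example; the actual names of the Loop program's parameters are not listed in this section.

```python
def iter_values(initial, step, total):
    """Yield the value taken by `iter` on each of `total` iterations.

    A positive `step` increments the variable, a negative one
    decrements it.
    """
    for k in range(total):
        yield initial + k * step


print(list(iter_values(1, 2, 4)))    # [1, 3, 5, 7]
print(list(iter_values(10, -5, 3)))  # [10, 5, 0]
```

Each value in the sequence corresponds to one execution of the flow, with iter holding that value.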
The option about the selected flow, just like the option presented in the Projects and Lines tab. Analogously, the option → allows the user to visualize the report of the current flow. The difference is that the line's report can include the flow's report. Consult Section 3.6, “Report of a project or a line”.
→ allows the user to add comments.
The jobs submitted for execution in the Flows tab can be followed in the Job Control tab. As elsewhere in GêBR, the information presented here is preserved even when GêBR is closed. It is possible to consult the flow's state (whether it has ended, has failed or is queued), the results and the submission details.
As seen in Section 4.7, “Executing flows”, GêBR switches to the Jobs tab whenever multiple flows are executed. In other cases, the user can switch from the Flows tab by using the arrow on the info bar at the bottom of the tab, which gives a quick view of the jobs.
This tab has three sections:
Toolbar: Operations over jobs
Left Panel: List of jobs
Right Panel: Details of a selected job
In this tab the user can follow and, if so desired, interrupt the flow's execution. The available commands are:
Save ( ). Saves to disk the output produced so far by the selected flow.
Clear job ( ). Removes the selected job from the Jobs list. Caution is needed because this operation is irreversible.
The user can only clear jobs that are not being executed ( or ).
Cancel this job ( ). The user can cancel a running ( ) or queued ( ) job. When a job is terminated in this manner, the job will be marked with .
Filter jobs ( ). The user has the option to see just the desired results.
As elsewhere in GêBR, many of the toolbar functions can be accessed through the context menu, using the right mouse button.
In the left panel, GêBR presents a list of flows executed by the user, ordered by moment of execution. Each exhibited flow can be in one of four states:
Running ( ): job in execution;
Finished ( ): already completed job;
Canceled ( ): stopped
Beside the job name, an icon ( ) appears to indicate that the job came from the execution of a snapshot (see Section 5.3.2, “Snapshot actions”).
In the upper part, information about the selected flow is shown. More information is shown by clicking on Details. The information shown includes:
Description of the related flow. If the job comes from a snapshot, the icon is displayed along with the description of the snapshot.
Moment of start/finish of the flow.
Elapsed time in execution
Input/output files
The group of nodes in which the flow was executed
Which machines were effectively used in the group and how the distribution of the work was done can be consulted by clicking on .
Maestro
GêBR (client)
Total number of processors used in the execution
Distribution of the jobs to the nodes
The results (standard output) of the flow are shown. Redirected output is not shown. Right-clicking on this window opens an Options menu, in which the user can enable automatic word wrap and automatic scrolling of the window as the output is produced.
The user can also consult the command lines executed by GêBR by visiting the Command Line tab. The command lines are portable and can be copied and pasted into a terminal for execution.
From the menu, the user can access and configure the settings. If the user's enthusiasm for GêBR has led to reading the whole manual, then the user has probably used almost all available features found in these options. But just in case, the user will find the documentation for these options below.
The details the user provides in the dialog box will be adopted as the default by GêBR. Specifically:
User name: will be used as the default for Author, when the user creates projects, lines and flows.
Email: will be used as the default for Email, when the user creates projects, lines and flows.
User's menus directory: will be the default directory where GêBR's Menus are maintained (mnu files).
This option helps the user to configure all the connections GêBR needs to work properly.
The connection assistant guides the user through connecting to a maestro and its servers, and enabling remote browsing (for more information, see Section 7.3, “Maestro/Nodes configuration” and Section 8.2, “Accessing remote files”).
The Maestro/Nodes Interface has three parts:
Maestro: central unit of coordination of multiple machines (for more information, see Section 2.3, “GêBR players”).
Nodes: choose the nodes associated to the current maestro.
Groups: configure the groups of nodes on the current maestro.
In the upper part of the Maestro/Nodes Interface (see Figure 7.2, “Maestro / Nodes configuration dialog box”), there is an entry in which the user can enter the machine to be used as maestro. A requirement is that the maestro must be installed on that machine. The state of the connection between GêBR and the maestro is shown by the right-side icon:
connected ()
disconnected ()
error ()
Once the maestro has been selected and the connection has been successfully established, the user can associate nodes to that maestro.
Each line of the nodes window represents a machine associated with the maestro. The lines have some columns:
The user can add a new machine by clicking over New and entering the hostname or IP address of the machine.
It's possible to employ a subset of the machines associated to the maestro. This can be done through the creation of groups of machines.
The administration of the groups is in the bottom part of the Maestro/Nodes interface (see Figure 7.2, “Maestro / Nodes configuration dialog box”).
To create a group, click on a node in the list and drag it to the icon. A text box will prompt for the name of the group. Two groups cannot have the same name.
Example 7.1. Using the groups
Using groups is very useful when it is necessary to run a flow on a subset of machines different from the default (for more information, see Section 4.7.2, “Setup and run”). This is useful because it is sometimes desirable to spare the resources of some processing nodes, so as not to overload all of them.
For example:
It's easy to see that the execution was distributed in the nodes of subset (group) chosen by the user and not in all the processing nodes.
In the → menu, there are samples to be imported. GêBR offers bundled samples for download on its homepage, such as the gebr-menus-su package. Here are some samples this package provides:
Once imported, the associated flows can be readily consulted or executed (consult Section 4.7, “Executing flows”).
GêBR can benefit from the resources of multiple processing nodes. To handle this, GêBR is divided into three layers:
GêBR can be connected to only one maestro at once. Each maestro, in turn, can be connected to many nodes. However, those processing nodes must share the file system containing the user's home directory. This is usually provided by Network File System (NFS) infrastructure.
GêBR model comprises communications between processing nodes (see Figure 2.2, “Communication layout between GêBR players”), namely:
All connections are performed using Secure Shell (SSH) protocol. SSH is intended to provide secure encrypted communication over a network. A login password may be asked multiple times to establish such connections. This can be cumbersome, if there are a lot of nodes under the maestro domain.
SSH public-key authentication is an alternative method of establishing connections that eliminates the need for password requests. Although less cumbersome, this method is equally secure. It is based on public-key cryptography, in which a public/private key pair is used for authentication. The maestro knows the public key and only the client knows the associated private key.
By checking the option Use encryption key to automatically authenticate the next session in the password dialog, GêBR will use public/private key authentication. Once this operation is successfully done, there will be no need to type the password to connect to that maestro through GêBR. Analogous behavior occurs in the connection between maestros and nodes.
Alternatively, the private/public key pair can be created without GêBR (consult here for more information).
The GêBR infrastructure comprises GêBR, the maestro and the nodes as main actors (see Section 8.1, “Intercommunication between players”). The files being processed may therefore reside on a node other than the user's own.
Whichever node holds the file, it can be accessed through remote navigation (see window below).
In two cases GêBR does not browse the nodes' file system: the import and the export of projects, lines or flows. In these cases, the backup can be saved on the local machine instead of on remote nodes.
GêBR puts markers (bookmarks) to the important folders of the line in context (see Section 3.3, “Important folders of a line”). They aim to facilitate the access to the files of the line.
These bookmarks also appear in external file browsers (Nautilus, for instance). They disappear when GêBR is closed.
GêBR takes advantage of the multi-core feature of most of the recent machines. The execution of repetitive flows can be optimized by this resource. If the flow has Loops and is parallelizable (given some criteria, shown below), the execution performance based on the number of processors can be adjusted.
GêBR also takes advantage of the number of processing nodes connected to the user's maestro. The resources of every node can be effectively employed to improve the performance.
Accordingly, GêBR can parallelize the flow if:
and if the flow is parallelizable.
To be considered parallelizable, besides having a loop, a flow must satisfy one, and only one, of these criteria:
The level of parallelism of a job execution can be adjusted by the Advanced Execution settings (see Section 4.7.2, “Setup and run”).
In case the flow is not parallelizable (i.e., does not fit any of the above conditions), GêBR runs the job on the chosen node.
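To illustrate what parallelizing a loop means in practice, the Python sketch below splits the iterations of a loop into nearly equal shares, one per processor. The splitting strategy and the function name `split_iterations` are assumptions made for illustration only; this guide does not describe how GêBR actually distributes the iterations.

```python
def split_iterations(total, workers):
    """Split iterations 0..total-1 into `workers` contiguous ranges.

    The ranges differ in size by at most one iteration; the first
    ranges receive the extra iterations when the division is uneven.
    """
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# Ten iterations shared among four processors.
for share in split_iterations(10, 4):
    print(list(share))
```

With ten iterations and four processors, two processors receive three iterations and two receive two, so no processor stays idle while another is overloaded.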
Computers nowadays are multitasking, which means that several things can be done at the same time. When many tasks are executed at the same time, the computer can become overloaded and its performance can drop. Seismic processing, in particular, can exhaust the computer's resources.
GêBR has a feature that overcomes the issue of overloading due to multitasking, by enabling the execution of the flows in a Wait for free resources state:
Two options are available (see Section 4.7.2, “Setup and run”):
Dispute resources (the execution of the flow is going to dispute for the computer resources with all the other active programs).
Wait for free resources (the flow is going to wait its turn to execute and try not to overload the system).
Technically, when running in Wait for free resources mode, GêBR reduces the execution priority of the task, telling the computer that the task "can wait for more important things to be done". This suits the case in which the user has other things to do while waiting for the calculations to finish.
The Share available resources mode means GêBR will use greater run priority for the task, and that implies it will act as a foreground process, demanding more resources. It's the "I need this done now" mode, when the user needs the job to be finished as soon as possible, and doesn't care if it will fight for resources with other programs.
If GêBR is the only program executing on the node, i.e., it has no competitor for the computer's resources, then both states correspond to the same situation. This is the "nightly job" situation, when (theoretically) no one is using the processing nodes and some jobs are left to be done for the next morning.
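On Unix-like systems, lowering the priority of a task is commonly done with the nice utility. The Python sketch below only illustrates that general mechanism: whether GêBR relies on nice internally is an assumption, and the function name and the script name `./my_flow.sh` are hypothetical.

```python
import shlex


def with_low_priority(command, niceness=10):
    """Prefix `command` (a list of arguments) with the Unix `nice` utility.

    A higher niceness means a lower scheduling priority, so the
    command yields to other programs competing for the processor.
    """
    return ["nice", "-n", str(niceness)] + list(command)


# Hypothetical script standing in for the command line of a flow.
cmd = with_low_priority(["./my_flow.sh"])
print(shlex.join(cmd))  # nice -n 10 ./my_flow.sh
```

Running the command without the prefix corresponds to competing for resources on equal terms with every other program.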
Some programs, known as parallel programs, handle the distribution of their computations among many nodes themselves. In this way, those programs are much more efficient, since they exploit all available resources to run faster. Technically, they rely on an infrastructure known as MPI (Message Passing Interface). GêBR supports parallel programs.
There are many implementations (flavors) of MPI, but the most widely used are OpenMPI and MPICH2. OpenMPI is an open-source implementation influenced by three earlier approaches: FT-MPI, LA-MPI and LAM/MPI. MPICH2 is another widely used implementation and is based on the MPI-2.2 standard.
GêBR supports both OpenMPI and MPICH2. However, MPI programs can only run on nodes that support the execution of MPI. Thus, to use MPI through GêBR, the user must have access to nodes/clusters that also support it.
In the maestro/nodes interface, in the MPI column, an icon indicates whether the node supports MPI. Hover over the icon to check the flavors of MPI available on that processing node. A parallel program can be executed only on processing nodes that support the same flavor as the one it uses.
To run an MPI program, it is first necessary to create a menu in DêBR and choose the proper MPI implementation for the program. Then import it into GêBR. Double-clicking on the MPI program in the Flows tab shows the number of processes to be used by that MPI call.
The system administrator can configure global options applicable to everyone who starts GêBR for the first time.
GêBR checks for an environment variable (GEBR_DEFAULT_MAESTRO) to suggest a maestro. The syntax of this variable is maestro_A, description_A; maestro_B, description_B.
For instance, if the administrator sets GEBR_DEFAULT_MAESTRO as
127.0.0.1, My first maestro; 100.00.00.0, My second maestro;
then, on the first run or through the Connection Assistant (see Section 7.2, “Connection assistant”), GêBR will offer the following options:
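As an illustration of the syntax of GEBR_DEFAULT_MAESTRO, the Python sketch below splits a value into (address, description) pairs. The function name `parse_default_maestro` is hypothetical, and the tolerance for a trailing semicolon is an assumption based on the example value above; this is not GêBR's own parser.

```python
def parse_default_maestro(value):
    """Parse 'addr_A, desc_A; addr_B, desc_B' into (address, description) pairs."""
    maestros = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';' as in the example value
        address, _, description = entry.partition(",")
        maestros.append((address.strip(), description.strip()))
    return maestros


value = "127.0.0.1, My first maestro; 100.00.00.0, My second maestro;"
for address, description in parse_default_maestro(value):
    print(address, "-", description)
```

An administrator could use a check of this kind to confirm that the variable is well formed before deploying it to all users.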
Additionally, GêBR can also automatically add processing nodes. This configuration can be done through a file in the path /etc/gebr/servers.conf. The syntax of this file must be the names of the nodes enclosed by brackets.
For instance, if the administrator sets /etc/gebr/servers.conf with
[node1] [node2] [node3]
then for every user with that maestro, the processing nodes node1, node2 and node3 will be available.
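The bracket syntax of servers.conf can be checked with a short Python sketch such as the one below. The function name `parse_servers_conf` is hypothetical and the regular expression reflects only the syntax shown in the example above, not GêBR's own parser.

```python
import re


def parse_servers_conf(text):
    """Return the node names found between square brackets in `text`."""
    return re.findall(r"\[([^\[\]]+)\]", text)


print(parse_servers_conf("[node1] [node2] [node3]"))
# ['node1', 'node2', 'node3']
```

To check a real file, its contents can be read and passed to the same function.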
Place where the variables are defined.
See Also Variable.
Sequence of operations over input data.
See Also Program.
Subset of processing nodes of the maestro.
See Also Maestro, Processing nodes.
Seismic data composed of processing flows.
See Also Flow.
Resources manager of the pool of processing nodes.
See Also Client, Processing nodes.
Representation of a single program or a set of programs.
See Also Program.
Piece of job to be done.
See Also Job.
Value or information stored for later use.
See Also Dictionary of variables.