How many custom fields are too many in one run?

I run a test on a PDA UPLC with custom fields that use values for CConst1 to CConst6, so it is fairly busy. I also have 2 derived channels, with CCalref1 populated for both channels for use in a summary custom field. My main custom field then calculates impurity percentages for unknown peaks for the samples in my 50-minute run, which has lots of peaks to calculate.
Although I have confined my custom fields (cfs) as best I can by limiting sample type etc., I'm worried I'm overloading the raw data space in the project. I only need my summary intersample cfs on injections 4 and 5, so is it best to put the Summarize Custom Fields function directly under injection 5 to save space?
I know software packages always claim you can use every single inch of them, but does that sound like too much? Processing my sample set takes at least 5 minutes!

Best Answer

  • MJS
    I believe the answer really is as simple as the capacity of the PC/DB server's drive. There is a distinction between raw data (the chromatographic data, ".dat" files) and tablespace data (all the metadata, from injection identifiers such as sample names through to the final results, plus the various database-related files), but how these are stored depends on your specific installation configuration (see the excerpt at the bottom). When processing a large, complex sample set, minimizing the custom field processing, as you say you try to do, is great: it reduces the number of calculations performed and, in turn, speeds up processing and reduces the amount of tablespace data generated. Given how complex your processing and cfs appear to be from your previous posts, I can't say I'm too surprised by a 5-minute processing time.

    Just to note: I don't think the location of the Summarize Custom Fields function would make any difference. I believe Empower will still attempt to calculate all fields that meet the sample type restrictions, which will just produce irrelevant or erroneous results for the fields/injections you don't need. To further improve the time, if it is really that important: minimize project scopes so that only the cfs that are absolutely relevant are present; review the processing methods and their options (e.g. do you continually process for peak purity even though you don't routinely use that information?); and, probably as a last resort, modify formulae to be more specific where possible (e.g. use specific peak names rather than flexible/open-ended/robust formulae, so a value is generated only for those specific peaks rather than for all).
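To put rough numbers on why restricting sample types and peak names helps, here is a back-of-the-envelope sketch. It assumes a peak-level custom field produces one value per field, per peak, per injection it applies to; that is a simplification for illustration, not a description of how Empower actually schedules calculations, and the counts are made up.

```python
def values_generated(injections, peaks_per_injection, fields):
    """Rough count of peak-level custom field values.

    Assumes one value per field, per peak, per injection (illustrative only).
    """
    return injections * peaks_per_injection * fields


# Hypothetical sample set: 20 injections, 40 integrated peaks each, 6 peak-level fields.
unrestricted = values_generated(injections=20, peaks_per_injection=40, fields=6)

# Same fields restricted to 8 Unknown injections and 5 named peaks.
restricted = values_generated(injections=8, peaks_per_injection=5, fields=6)

print(unrestricted, restricted)
```

With these hypothetical counts the restricted case generates a small fraction of the values, which is why tightening sample type and peak name restrictions tends to help both processing time and tablespace growth.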

    The following is an excerpt from Waters doc 715003451, Emp3 Install guide:

    Empower raw data files
    The size of the Empower chromatography raw data files varies according to sampling rates, run times, and number of samples. PDA and MS files are bigger because they are 3-dimensional data (wavelength and mass range must be included). The total space requirement depends on how often you archive and how many systems are creating raw data files.
    Tip: Raw data files can grow very quickly. Hundreds of GB may be needed for raw data. If you use up too much space, you can backup older projects to regain space on the raw data drive.

    Empower database
    The database datafiles (tablespace files) are configured to “autoextend”. As projects, raw data files, and results are created, the initial database datafile must autoextend to store all information.
    Tip: The amount of free disk space limits the extension of the database files. You can add additional tablespace files to other hard drives, space permitting, or free space on the original drive, to allow for adequate extension.
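Since both the raw data drive and the tablespace files are ultimately limited by free disk space, a quick check of the relevant drives can tell you how close you are. The following is a minimal sketch using Python's standard library, not anything from the Waters guide; the function name, the 50 GB warning threshold, and the path you pass in are all assumptions to adapt to your own installation.

```python
import shutil


def free_space_report(path, warn_below_gb=50.0):
    """Report total and free space on the drive holding `path`.

    `warn_below_gb` is an arbitrary illustrative threshold, not a Waters recommendation.
    """
    usage = shutil.disk_usage(path)
    free_gb = usage.free / 1024**3
    total_gb = usage.total / 1024**3
    return {
        "path": path,
        "free_gb": round(free_gb, 1),
        "total_gb": round(total_gb, 1),
        "low": free_gb < warn_below_gb,
    }


# "." is a stand-in; point this at your raw data drive or database drive instead.
print(free_space_report("."))
```

Running this against the raw data drive and the drive holding the database datafiles gives a quick picture of how much room is left for new raw data and for the tablespace to autoextend.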

Answers

  • Thanks very much for the reply. I will look into what our capacity is here at work. I have tried to pin down the required custom fields as much as possible by restricting sample types to unknowns, found peaks, etc. Also, the related substances are all run in one project, so there is no chance that peaks from other runs will use up the space.
  • More custom fields, specifically ones that have dependencies, will slow down processing of data as well as contributing to tablespace content. The next Service Release of Empower will not have tablespace limitations. While this will make it easier to administer, you may find that projects with many fields and calculations are allowed to run away on their own without a notification to investigate.