Online Computing

General

The online Web server front page is available here. This Drupal section holds complementary information.
A list of all operation manuals (beyond detector sub-systems) is kept on a separate page with restricted access.
Please use it as a startup page.

Detector sub-systems operation procedures - Updated 2008, requested confirmation for 2009

Online computing run preparation plans

This page lists, by year, action items, run plans and open questions. It will serve as a repository for the documents used as a basis for drawing up the requirements. To see documents in this tree, you must belong to the Software and Computing OG (the pages are not public).

Run 19

Feedback from software coordinators

Active feedback

Sub-system	Coordinator	Calibration POC	Online monitoring POC
MTD	Rongrong Ma	- same -	- same -
EMC	Raghav Kunnawalkam Elayavalli Nick Lukow	- same -	Note: L2algo, bemc and bsmdstatus
EPD	Prashanth Shanmuganathan	N/A	- same -
BTOF	Frank Geurts	- same -	Frank Geurts Zaochen Ye
ETOF	Florian Seck	- same -	Florian Seck Philipp Weidenkaff
HLT	Hongwei Ke	- same -	- same -

Other software coordinators

sub-system	Coordinator
iTPC (TPC?)	Irakli Chakaberia
Trigger	Akio Ogawa
DAQ	Jeff Landgraf
...

Run 20

Status of calibration timeline initialization

In RUN: EEMC, EMC, EPD, ETOF, GMT, TPC, MTD, TOF
Test: FST, FCS, STGC (no tables)
Desired init dates were announced to all software coordinators:

- Geometry tag has a timestamp of 20191120
- Simulation timeline [20191115,20191120[
- DB initialization for real data [20191125,...]

     Please initialize your table content appropriately, i.e.
sim flavor initial values are entered at 20191115 up to 20191119
(please exclude the edge), ofl initial values at 20191125
(with the run starting on the 1st of December, even tomorrow's cosmic
and commissioning data would pick up the proper values).
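
As an illustration only, the windows above can be encoded as a small check that an initialization date fits its flavor. The sketch assumes that `[a,b[` denotes a half-open interval (start included, end excluded), as the request to exclude the edge suggests. The function names are hypothetical and are not part of any STAR database tool.

```python
from datetime import date

# Windows quoted in the announcement above; "[a,b[" is read as a <= d < b.
SIM_WINDOW = (date(2019, 11, 15), date(2019, 11, 20))  # [20191115,20191120[
OFL_START = date(2019, 11, 25)                         # [20191125,...]


def parse_stamp(stamp):
    """Convert a YYYYMMDD stamp such as '20191115' into a date."""
    return date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))


def init_date_ok(flavor, stamp):
    """Return True if an initial-value date fits the window for its flavor."""
    day = parse_stamp(stamp)
    if flavor == "sim":
        # Start included, end excluded.
        return SIM_WINDOW[0] <= day < SIM_WINDOW[1]
    if flavor == "ofl":
        # Open-ended window for real data.
        return day >= OFL_START
    raise ValueError("unknown flavor: " + flavor)
```

For example, `init_date_ok("sim", "20191120")` is rejected because the edge is excluded, while `init_date_ok("ofl", "20191125")` is accepted.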

Status - 2019/12/10

EMC = ready
ETOF = ready - initialized at 2019-11-25, no sim (confirming)
TPC = NOT ready [look at year 19 for comparison]
MTD = ready
TOF = Partially ready? INL correction, T0, TDC, status and alignment tables initialized
EPD = gain initialized at 2019-12-15 (!?), status not initialized, no sim

EEMC = ready? (*last init at 2017-12-20)
GMT = ready (*no db tables)

Status - 2019/12/09

EMC = ready
ETOF = ready? initialized at 2019-11-25, no sim
TPC = NOT ready
MTD = ready
TOF = NOT ready
EPD = gain initialized at 2019-12-15 (!?), status not initialized, no sim

EEMC = ready? (*last init at 2017-12-20)
GMT = ready (*no db tables)

Software coordinator feedback for Run 20 - Points of Contact

Sub-system	Coordinator	Calibration POC	Online monitoring POC
MTD	Rongrong Ma	- same -	- same -
EMC EEMC	Raghav Kunnawalkam Elayavalli Nick Lukow	- same -	Note: L2algo, bemc and bsmdstatus
EPD	[ TBC]	- same -	- same -
BTOF	Frank Geurts	- same -	Frank Geurts Zaochen Ye
ETOF	Florian Seck	- same -	Florian Seck Philipp Weidenkaff
HLT	Hongwei Ke	- same -	- same -
TPC	Irakli Chakaberia	- same -	Flemming Videbaek
Trigger detectors	Akio Ogawa	- same -	- same -
DAQ	Jeff Landgraf	N/A

---

Run 21

Status of calibration timeline initialization

- Geometry tag has a timestamp of 20201215
- Simulation timeline [20201210, 20201215]
- DB initialization for real data [20201220,...]

Status - 2020/12/10

Software coordinator feedback for Run 21 - Points of Contact

Sub-system	Coordinator	Calibration POC	Online monitoring POC
MTD	Rongrong Ma	- same -	- same -
EMC EEMC	Raghav Kunnawalkam Elayavalli Nick Lukow	- same -	Note: L2algo, bemc and bsmdstatus
EPD	Prashanth Shanmuganathan (TBC)	Skipper Kagamaster	- same -
BTOF	Zaochen Ye	- same -	Frank Geurts Zaochen Ye
ETOF	Philipp Weidenkaff	- same -	Philipp Weidenkaff
HLT	Hongwei Ke	- same -	- same -
TPC	Yuri Fisyak	- same -	Flemming Videbaek
Trigger detectors	Akio Ogawa	- same -	- same -
DAQ	Jeff Landgraf	N/A
Forward Upgrade	Daniel Brandenburg	- same -	FCS - Akio Ogawa sTGC - Daniel Brandenburg FST - Shenghui Zhang/Zhenyu Ye

---

Run 22

Status of calibration timeline initialization

- Geometry tag has a timestamp of 20211015
- Simulation timeline [20211015, 20211020[
- DB initialization for real data [20211025,...]

Status - 2021/10/13

Software coordinator feedback for Run 22 - Points of Contact (TBC)

Sub-system	Coordinator	Calibration POC	Online monitoring POC
MTD	Rongrong Ma	- same -	- same -
EMC EEMC	Raghav Kunnawalkam Elayavalli Navagyan Ghimire	- same -	Note: L2algo, bemc and bsmdstatus
EPD	Prashanth Shanmuganathan (TBC)	Skipper Kagamaster	- same -
BTOF	Zaochen Ye	- same -	Frank Geurts Zaochen Ye
ETOF	Philipp Weidenkaff	- same -	Philipp Weidenkaff
HLT	Hongwei Ke	- same -	- same -
TPC	Yuri Fisyak	- same -	Flemming Videbaek
Trigger detectors	Akio Ogawa	- same -	- same -
DAQ	Jeff Landgraf	N/A
Forward Upgrade	Daniel Brandenburg	- same -	FCS - Akio Ogawa sTGC - Daniel Brandenburg FST - Shenghui Zhang/Zhenyu Ye

---

Run XIII

Meeting minutes
Database initialization check-list
Calibration requests POC
Online Monitoring POC

Preparation meeting minutes

Preparation meeting #1 (this included a summary of a page with restricted access)
Preparation meeting #2 (this meeting was informal, meeting #3 bundled two weeks progress)
Preparation meeting #3 (in meeting #4 announcement)
Preparation meeting #4 (minutes were not sent)
Preparation meeting #5
Preparation meeting #6
Preparation meeting #7
Preparation meeting #8
Preparation meeting #9
Preparation meeting #10 (informally met with a few team members, no formal meeting)
Run support meeting #11
Run support meeting #12
Run support meeting #13 (minutes were not sent - all action items followed-up)
Run support meeting #14
Run support meeting #15
Run support meeting #16

Database initialization check list

TPC Software       - Richard Witt          NO
GMT Software       - Richard Witt          NO
EMC2 Software      - Alice Ohlson          Yes
FGT Software       - Anselm Vossen         Yes
FMS Software       - Thomas Burton         Yes
TOF Software       - Frank Geurts          Yes
Trigger Detectors  - Akio Ogawa            ??
HFT Software       - Spyridon Margetis     NO (no DB interface, hard-coded values in preview codes)

Calibration Points of Contact per sub-system

If a name is missing, the POC role falls onto the coordinator.

                     Coordinator           Possible POC
                     ------------          ---------------
TPC Software       - Richard Witt
GMT Software       - Richard Witt
EMC2 Software      - Alice Ohlson          Alice Ohlson
FGT Software       - Anselm Vossen
FMS Software       - Thomas Burton         Thomas Burton
TOF Software       - Frank Geurts
Trigger Detectors  - Akio Ogawa
HFT Software       - Spyridon Margetis     Hao Qiu

Online Monitoring POC

The final list from the Spin PWGC is kept on a page with restricted access. The table below merges the Spin PWGC feedback with the other feedback received.

Directories we inferred are being used (as reported in the RTS Hypernews)
scaler	Len Eun and Ernst Sichtermann (LBL)	This directory usage was indirectly reported
SlowControl	James F Ross (Creighton)
HLT	Qi-Ye Shou	The 2012 directory had a recent timestamp but was owned by mnaglis. Aihong Tang contacted 2013/02/12. Answer from Qi-Ye Shou 2013/02/12: will be POC.
fmsStatus	Yuxi Pan (UCLA)	This was not requested, but the 2011 directory is being overwritten by user=yuxip. FMS software coordinator contacted for confirmation 2013/02/12. Yuxi Pan confirmed 2013/02/13 as POC for this directory.
Spin PWG monitoring related directories follow
L0trg	Pibero Djawotho (TAMU)
L2algo	Maxence Vandenbroucke (Temple)
cdev	Kevin Adkins (UKY)
zdc	Len Eun and Ernst Sichtermann (LBL)
bsmdStatus	Keith Landry (UCLA)
emcStatus	Keith Landry (UCLA)
fgtStatus	Xuan Li (Temple)	This directory is also being written by user=akio, causing access protection and possible clash problems. POC contacted on 2013/02/08; both Akio and POC contacted again 2013/02/12 -> confirmed as OK.
bbc	Prashanth (KSU)

Run XIV

Meeting minutes
Pre run-preparation notes (basic checks, announcements)
Database initialization check list
Calibration Points of Contact per sub-system
Online Monitoring POC

Notes

2013/11/15
- Info gathering begins (directories/areas and Points of Contact)
  Status:
  2013/11/22, directory structure, 2 people provided feedback, Renee coordinated the rest
  2013/11/25, calibration POC, 3 coordinators provided feedback - Closed 2013/12/04
  2013/12/04, geometry for Run 14,
- Basic check: CERT for online is old if coming from the Wireless
  Status: fixed at ITD level, 2013/11/18 - the reverse proxy did not have the proper CERT
2013/11/25
- Send the request to software sub-system coordinators for calibration POC for Run 14, deadline December 2nd.

Database initialization check list

The actions suggested by this section have not started yet.

Sub-system	Coordinator	Check done
DAQ	Jeff Landgraf
TPC	Richard Witt
GMT	Richard Witt
EMC2	Mike Skoby Kevin Adkins
FMS	Thomas Burton
TOF	Daniel Brandenburg
MTD	Rongrong Ma
HFT	Spiros Margetis	(not known)
Trigger	Akio Ogawa
FGT	Xuan Li

Calibration Points of Contact per sub-system

"-" indicates no feedback was provided. But if a name is missing, the POC role falls onto the coordinator.

Sub-system	Coordinator	Calibration POC
DAQ	Jeff Landgraf	-
TPC	Richard Witt	-
GMT	Richard Witt	-
EMC2	Mike Skoby Kevin Adkins	-
FMS	Thomas Burton	-
TOF	Daniel Brandenburg	-
MTD	Rongrong Ma	Bingchu Huang
HFT	Spiros Margetis	Jonathan Bouchet
Trigger	Akio Ogawa	-
FGT	Xuan Li	N/A

Online Monitoring POC

~~scaler~~		Not needed 2013/11/25
SlowControl	Chanaka DeSilva	OKed on second Run preparation meeting
HLT	Zhengqiao Zhang	Learned incidentally on 2014/01/28
HFT	Shusu Shi	Learned about it on 2014/02/26
~~fmsStatus~~		Not needed 2013/11/25
L0trg	Zilong Chang Mike Skoby	Informed 2013/11/10 and created 2013/11/15
L2algo	Nihar Sahoo	Informed 2013/11/25
~~cdev~~		Not needed 2013/11/25
zdc		may not be used (TBC)
bsmdStatus	Janusz Oleniacz	Info will be passed from Keith Landry 2014/01/20. Possible backup: Leszek Kosarzewski 2014/03/26
emcStatus	Janusz Oleniacz	Info will be passed from Keith Landry 2014/01/20. Possible backup: Leszek Kosarzewski 2014/03/26
~~fgtStatus~~		Not needed 2013/11/25
bbc	Akio Ogawa	Informed 2013/11/15, created same day

Run XV

Run 15 was prepared essentially by discussing with individuals, and a comprehensive page was not maintained.

Run XVI

This page will contain feedback related to the preparation of the online setup.

Notes

Online Monitoring POC

scaler
SlowControl
HLT	Zhengqiao	Feedback 2015/11/24
HFT	Guannan Xie	Spiros: Feedback 2015/11/24
~~fmsStatus~~		Akio: Possibly not needed (TBC). 2016/01/13 noted this was not used in Run 15 and will probably never be used again.
fmsTrg		Confirmed needed 2016/01/13
fps		Akio: Not needed in Run 16? Perhaps later.
L0trg	Zilong Chang	Zilong: Feedback 2015/11/24
L2algo	Kolja Kauder	Kolja: will be POC - 2015/11/24
cdev	Chanaka DeSilva
zdc
bsmdStatus	Kolja Kauder	Kolja: will be POC - 2015/11/24
bemcTrgDb	Kolja Kauder	Kolja: will be POC - 2015/11/24
emcStatus	Kolja Kauder	Kolja: will be POC - 2015/11/24
~~fgtStatus~~		Not needed since Run 14 ... May drop from the list
bbc	Akio Ogawa	Feedback 2015/11/24, needed
rp

Calibration Points of Contact per sub-system

Sub-system	Coordinator	Calibration POC
DAQ	Jeff Landgraf	-
TPC	Richard Witt Yuri Fisyak	-
GMT	Richard Witt	-
EMC2	Kolja Kauder Ting Lin	-
FMS	Oleg Eysser	-
TOF	Daniel Brandenburg	-
MTD	Rongrong Ma	(same confirmed 2015/11/24)
HFT	Spiros Margetis	Xin Dong
HLT	Hongwei Ke	(same confirmed 2015/11/24)
Trigger	Akio Ogawa	-
RP	Kin Yip	-

Database initialization check list

Shift Accounting

This page will now hold the shift accounting pages. They complement the Shift Sign-up process by documenting it.

Shift Dues and Special Requests Run 20
Run 19 special requests
Run 18 shift dues
Shift accounting, dues for Run 15 and findings
...

Run 18 shift dues

Dues

Run 18 Shift Dues & Notes

Period coordinators

As usual, period coordinators are pre-assigned, as arranged by the Spokespersons.

Special arrangements and requests

Under the family-related policy, the following 6 weeks of offline QA shifts were pre-assigned:
MAR 27 Kevin Adkins (Kentucky)
APR 03 Kevin Adkins
APR 10 Sevil Salur (Rutgers)
APR 17 Richard Witt (USNA/Yale)
MAY 22 Juan Romero (UC Davis)
JUN 12 Terry Tarnowsky (Michigan State)
Lanny Ray (UT Austin), as QA coordinator, always is pre-assigned the first QA week.
FIAS remains in “catch-up mode” and is taking extra shifts above their dues. Pre-assigned shifts can be requested in this scenario. FIAS has been pre-assigned 4 Detector Op shifts.
Bob Tribble (TAMU) requests the evening Shift leader slot during Apr 10-17.

Run 19 special requests

The following pre-assigned slot requests were made.

    9 WEEKS PRE-ASSIGNED QA AS FOLLOWS
    ==================================
    Lanny Ray (UT Austin) QA Mar 5
    Richard Witt (USNA/Yale) QA Mar 19
    Sevil Salur (Rutgers) QA Apr 16
    Wei Li (Rice) QA Apr 23
    Kevin Adkins (Kentucky) QA May 14
    Juan Romero (UC Davis) QA May 21
    Jana Bielcikova (NPI, Czech Acad of Sci) QA May 28  
    Yanfang Liu (TAMU) QA June 25 
    Yanfang Liu (TAMU) QA July 02
    
    8 WEEKS PRE-ASSIGNED REGULAR SHIFTS AS FOLLOWS
    ==================================
    Bob Tribble (BNL) Feb 05 SL evening 
    Daniel Kincses (Eotvos) Mar 12  DO Trainee Day
    Daniel Kincses (Eotvos) Mar 19  DO Day
    Mate Csanad (Eotvos) Mar 12 SC Day
    Ronald Pinter (Eotvos) Mar 19 SC Day
    Carl Gagliardi (TAMU)  May 14  SL day
    Carl Gagliardi (TAMU)  May 21 SL day 
    Grazyna Odyniec (LBNL) July 02 SL evening

Shift Dues and Special Requests Run 20

For the calculation of shift dues, there are two considerations.
1) The length of time of the various shift configurations (2 person, 4 person no trainees, 4 person with trainees, plus period coordinators/QA shifts)
2) The percent occupancy of the training shifts

For many years, 2) has hovered around 45%, which is what we used to calculate the dues. Since STAR gives credit for training shifts (as we should), this needs to be factored in or we would not have enough shifts.

The sum total of shifts needed is then divided by the total number of authors minus authors from Russian institutions, who cannot come to BNL.

date           weeks   crew   training   PC   OFFLINE
11/26-12/10    2       2      0          0    0
12/10-12/24    2       4      2          1    0
12/24-6/30     27      4      2          1    1
7/02-7/16      2       4      0          1    1

Adding these together (3x a shift for crew, 3x45% for training, plus pc plus offline) gives a total of 522 shifts.
The total number of shifters is 303 - 30 Russian collaborators = 273 people
Giving a total due of 1.9 per author.

For a given institution, the load is calculated as (# of authors - # of expert credits) x due, then set to an integer value, as cutting collaborators into pieces is non-collegial behavior.

However, this year, this should have been:
date           weeks   crew   training   PC   OFFLINE
11/26-12/10    2       2      0          0    0
12/10-12/24    2       4      2          1    0
12/24-6/02     23      4      2          1    1
6/02-6/16      2       4      0          1    1

Adding these together (3x a shift for crew, 3x45% for training, plus pc plus offline) gives a total of 456 shifts for a total due of 1.7 per author.
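
The arithmetic above can be reproduced with a short sketch. It assumes each table row contributes weeks x (3 x crew + 3 x 0.45 x training + PC + OFFLINE) shifts, which is one reading of the description in the text. The helper names, and the rounding to the nearest integer for an institution's load, are assumptions made for illustration only.

```python
# Rows: (period, weeks, crew, training, PC, offline), copied from the two tables above.
ORIGINAL_PLAN = [
    ("11/26-12/10", 2, 2, 0, 0, 0),
    ("12/10-12/24", 2, 4, 2, 1, 0),
    ("12/24-6/30", 27, 4, 2, 1, 1),
    ("7/02-7/16", 2, 4, 0, 1, 1),
]
REVISED_PLAN = [
    ("11/26-12/10", 2, 2, 0, 0, 0),
    ("12/10-12/24", 2, 4, 2, 1, 0),
    ("12/24-6/02", 23, 4, 2, 1, 1),
    ("6/02-6/16", 2, 4, 0, 1, 1),
]

SHIFTS_PER_SLOT = 3          # "3x a shift" in the text
TRAINING_OCCUPANCY = 0.45    # long-term occupancy of training shifts
AUTHORS = 303 - 30           # total shifters minus Russian collaborators


def total_shifts(plan):
    """Sum the weekly shift needs over all periods of a plan."""
    total = 0.0
    for _period, weeks, crew, training, pc, offline in plan:
        per_week = (SHIFTS_PER_SLOT * crew
                    + SHIFTS_PER_SLOT * TRAINING_OCCUPANCY * training
                    + pc + offline)
        total += weeks * per_week
    return total


def due_per_author(plan, authors=AUTHORS):
    """Average number of shifts owed per author."""
    return total_shifts(plan) / authors


def institution_load(n_authors, n_expert_credits, due):
    """(authors - expert credits) x due, rounded to an integer (rounding rule assumed)."""
    return round((n_authors - n_expert_credits) * due)
```

Under these assumptions the first table gives about 522.3 shifts (1.9 per author) and the second 455.5 shifts (1.7 per author), consistent with the rounded figures quoted above.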

We allowed some people to pre-sign up, for a couple of different reasons.

Family reasons, hence offline QA:
James Kevin Adkins
Jana Bielčíková
Sevil Salur
Md. Nasim
Yanfang Liu

Additionally, Lanny Ray is given the first QA shift of the year as our experienced QA shifter.

This year, to add an incentive to train for shift leader, we allowed people who were doing shift leader training to sign up for both their training shift and their "real" shift early:
Justin Ewigleben
Hanna Zbroszczyk
Jan Vanek
Maria Zurek
Mathew Kelsey
Kun Jiang
Yue-Hang Leung

Both Bob Tribble and Grazyna Odyniec signed up early for a shift leader position in recognition of their schedules and contributions.

This year because of the date of Quark Matter and the STAR pre-QM meeting, several people were traveling on Tuesday during the sign up. These people I signed up early as I did not want to punish some of our most active colleagues for the QM timing:
James Daniel Brandenburg
Sooraj Radhakrishnan

3 other cases that were allowed to pre-sign up:
Panjab University had a single person who had the visa to enter the US, and had to take all of their shifts prior to the end of their contract in March. So that the shifter could have some spaces in his shifts for sanity, I signed up:
Jagbir Singh
Eotvos Lorand University stated that travel is complicated for their group, and so it would be good if they could ensure that they were all on shift at the same time. Given that they are coming from Europe I signed up:
Mate Csanad
Daniel Kincses
Roland Pinter
Srikanta Tripathy
Frankfurt Institute for Advanced Studies (FIAS) wanted to be able to bring Masters students to do shift, but given the training requirements and timing with school and travel for Europe, this leaves little availability for shift. So I signed up:
Iouri Vassiliev
Artemiy Belousov
Grigory Kozlov

Tools

This is to serve as a repository of information about various STAR tools used in experimental operations.

Implementing SSL (https) in Tomcat using CA generated certificates

The reason for using a certificate from a CA as opposed to a self-signed certificate is that, with a self-signed certificate, the browser shows a warning screen and asks you to accept the certificate. As the browser already ships with a list of trusted CAs, this step is not needed for a CA-issued certificate.

The following list of certificates and a key are needed:

/etc/pki/tls/certs/wildcard.star.bnl.gov.Nov.2012.cert - host cert.
/etc/pki/tls/private/wildcard.star.bnl.gov.Nov.2012.key - host key (don't give this one out)
/etc/pki/tls/certs/GlobalSignIntermediate.crt - intermediate cert.
/etc/pki/tls/certs/GlobalSignRootCA_ExtendedSSL.crt - root cert.
/etc/pki/tls/certs/ca-bundle.crt - a big list of many certs.

Concatenate the following certs into one file; in this example it is called Global_plus_Intermediate.crt:

cat /etc/pki/tls/certs/GlobalSignIntermediate.crt > Global_plus_Intermediate.crt
cat /etc/pki/tls/certs/GlobalSignRootCA_ExtendedSSL.crt >> Global_plus_Intermediate.crt
cat /etc/pki/tls/certs/ca-bundle.crt >> Global_plus_Intermediate.crt

Run this command. Note that "-name tomcat" and "-caname root" should not be changed to any other value: the command itself will still work, but the result will fail under Tomcat. If it works you will be asked for a password; that password should be set to "changeit".

 openssl pkcs12 -export -in wildcard.star.bnl.gov.Nov.2012.cert -inkey wildcard.star.bnl.gov.Nov.2012.key -out mycert.p12 -name tomcat -CAfile Global_plus_Intermediate.crt -caname root -chain

Test the new p12 output file with this command:

keytool -list -v -storetype pkcs12 -keystore mycert.p12

Note it should say: "Certificate chain length: 3"

In Tomcat's server.xml file, add a connector that looks like this:

<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
           maxThreads="150" scheme="https" secure="true"
           keystoreFile="/home/lbhajdu/certs/mycert.p12" keystorePass="changeit"
           keystoreType="PKCS12" clientAuth="false" sslProtocol="TLS"/>

Note the path should be set to the correct path of the certificate. And the p12 file should only be readable by the Tomcat account because it holds the host key.
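
The permission requirement can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, it inspects the POSIX mode bits only, and it does not verify that the file is actually owned by the Tomcat account.

```python
import os
import stat


def keystore_permission_problems(path):
    """Return a list of reasons why the keystore is readable beyond its owner."""
    mode = os.stat(path).st_mode
    problems = []
    # Any group permission bit set means members of the group can touch the file.
    if mode & stat.S_IRWXG:
        problems.append("group has access")
    # Any "other" permission bit set means every local account can touch the file.
    if mode & stat.S_IRWXO:
        problems.append("others have access")
    return problems
```

An empty list from `keystore_permission_problems("/home/lbhajdu/certs/mycert.p12")` would indicate that only the owner can access the file; a mode such as 0600 satisfies this.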

Online Linux pool

March 15, 2012:

THIS PAGE IS OBSOLETE! It was written as a guide in 2008 for documenting improvements in the online Linux pool, but has not been updated to reflect additional changes to the state of the pool, so not all details are up to date.

One particular detail to be aware of: the name of the pool nodes is now onlNN.starp.bnl.gov, where 01<=NN<=14. The "onllinuxN" names were retired several years ago.

Historical page (circa 2008/9):

Online Linux pool for general experiment support needs

GOAL:

Provide a Linux environment for general computing needs in support of the experimental operations.

HISTORY (as of approximately June 2008):

A pool of 14 nodes, consisting of four different hardware classes (all circa 2001), has been in existence for several years. For the last three (or more?) years, they have had Scientific Linux 3.x with support for the STAR software environment, along with access to various DAQ and Trigger data sources. The number of significant users has probably been less than 20, with the heaviest usage related to L2. User authentication was originally based on an antique NIS server, to which we had imported the RCF accounts and passwords. Though the server is still alive, we have not kept this NIS information up to date. Over time, local accounts on each node became the norm, though of course this is rather tedious. Home directories come in three categories: AFS, NFS on onllinux5, and local home directories on individual nodes. Again, this gets rather tedious to maintain over time.

There are several "special" nodes to be aware of:

Three of the nodes (onllinux1, 2 and 3) are in the Control Room for direct console login as needed. (The rest are in the DAQ room.)
onllinux5 has the NFS shared home directories (in /online/users). (NB. /online/users is being backed up by the ITD Networker backup system.)
onllinux6 is (was?) used for many online database maintenance scripts (check with Mike DePhillips about this -- we had planned to move these scripts to onldb).
onllinux1 was configured as an NIS slave server, in case the NIS master (starnis01) fails.

PLAN:

For the run starting in 2008 (2009?), we are replacing all of these nodes with newer hardware.

The basic hardware specs for the replacement nodes are:

Dual 2.4 GHz Intel Xeon processors

1GB RAM

2 x 120 GB IDE disks

These nodes should be configured with Scientific Linux 4.5 (or 4.6 if we can ensure compatibility with STAR software) and support the STAR software environment.

They should have access to various DAQ and Trigger NFS shares. Here is a starter list of mounts:

Shared DAQ and Trigger resources

SERVER	DIRECTORY on SERVER	LOCAL MOUNT POINT	MOUNT OPTIONS
evp.starp	/a	/evp/a	ro
evb01.starp	/a	/evb01/a	ro
evb01	/b	/evb01/b	ro
evb01	/c	/evb01/c	ro
evb01	/d	/evb01/d	ro
evb02.starp	/a	/evb02/a	ro
evb02	/b	/evb02/b	ro
evb02	/c	/evb02/c	ro
evb02	/d	/evb02/d	ro
daqman.starp	/RTS	/daq/RTS	ro
daqman	/data	/daq/data	rw
daqman	/log	/daq/log	ro
trgscratch.starp	/data/trgdata	/trg/trgdata	ro
trgscratch.starp	/data/scalerdata	/trg/scalerdata	ro
startrg2.starp	/home/startrg/trg/monitor/run9/scalers	/trg/scalermonitor	ro
online.star	/export	/onlineweb/www	rw

WISHLIST Items with good progress:

<Uniform and easy to maintain user authentication system to replace the current NIS and local account mess. Either a local LDAP, or a glom onto RCF LDAP seems most feasible> -- An ldap server (onlldap.starp.bnl.gov) has been set-up and the 15 onllinux nodes are authenticating to it *BUT* it is using NIS!

<Shared home directories across the nodes with backups> -- onlldap is also hosting the home directories and sharing them via NFS. EMC Networker is backing up the home directories and Matt A. is receiving the email notifications.

<Integration into SSH key management system (mechanism depends upon user authentication method(s) selected).> -- The ldap server has been added to the STAR SSH key management system, and users are able to login to the new onlXX nodes with keys now.

<Common configuration management system> -- Webmin is in use.

<Ganglia monitoring of the nodes> -- I think this is done...

<Osiris monitoring of the nodes> -- I think this is done - Matt A. and Wayne B. are receiving the notices...

WISHLIST Items still needing significant work:

None?

SSH Key Management

Overview

An SSH public key management system has been developed for STAR (see D. Arkhipkin et al 2008 J. Phys.: Conf. Ser. 119 072005), with two primary goals stemming from the heightened cyber-security scrutiny at BNL:

Use of two-factor authentication for remote logins

Identification and management of remote users accessing our nodes (in particular, the users of "group" accounts which are not tied to one individual), to achieve accountability

A benefit for users also can be seen in the reduction in the number of passwords to remember and type.

In purpose, this system is similar to the RCF's key management system, but is somewhat more powerful because of its flexibility in the association of hosts (client systems), user accounts on those clients, and self-service key installation requests.

Here is a typical scenario of the system usage:

A sysadmin of a machine named FOO creates a user account named "JDOE" and, if not done already, installs the key_services client.
A user account 'JDOE' on host 'FOO' is configured in the Key Management system by a key management administrator.
John Doe uploads (via the web) his or her public ssh key (in openssh format).
John Doe requests (via the web) that his key be added to JDOE's authorized_keys file on FOO.
A key management administrator approves the request, and the key_services client places the key in ~JDOE/.ssh/authorized_keys.

At this point, John Doe has key-based access to JDOE@FOO. Simple enough? But wait, there's more! Now John Doe realizes that he also needs access to the group account named "operator" on host BAR. Since his key is already in the key management system he has only to request that his key be added to operator@BAR, and voila (subject to administrator approval), he can now login with his key to both JDOE@FOO and operator@BAR. And if Mr. Doe should leave STAR, then an administrator simply removes him from the system and his keys are removed from both hosts.

Slightly Deeper...

There are three things to keep track of here -- people (and their SSH keys of course), host (client) systems, and user accounts on those hosts:

People want access to specific user accounts at specific hosts.

So the system maintains a list of user accounts for each host system, and a list of people associated with each user account at each host.
(To be clear -- the system does not have any automatic user account detection mechanism at this time -- each desired "user account@host" association has to be added "by hand" by an administrator.)

This Key Management system, as seen by the users (and admins), consists simply of users' web browsers (with https for encryption) and some PHP code on a web server (which we'll call "starkeyw") which inserts uploaded keys and user requests (and administrator's commands) to a backend database (which could be on a different node from the web server if desired).

Behind the scenes, each host that is participating in the system has a keyservices client installed that runs as a system service. The keyservices_client periodically (at five minute intervals by default) interacts with a different web server (serving different PHP code that we'll call starkeyd). The backend database is consulted for the list of approved associations and the appropriate keys are downloaded by the client and added to the authorized_keys files accordingly.
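
To make the bookkeeping concrete, here is a minimal sketch of the association model described above: people with keys, and approved "user account@host" associations from which each host's authorized_keys content follows. The data structures, function names and placeholder key strings are hypothetical; the real system stores this in its backend database and the actual schema and client logic are not shown here.

```python
# Hypothetical in-memory stand-ins for the backend database (not the real schema).
public_keys = {
    "John Doe": "ssh-ed25519 PLACEHOLDER_KEY_JOHN john.doe@example",
    "Jane Roe": "ssh-ed25519 PLACEHOLDER_KEY_JANE jane.roe@example",
}

# Approved associations: (account, host) -> people allowed to log in there.
approved = {
    ("JDOE", "FOO"): {"John Doe"},
    ("operator", "BAR"): {"John Doe", "Jane Roe"},
}


def authorized_keys_for(host, approved, public_keys):
    """Return {account: [key lines]} that a client on `host` should install."""
    wanted = {}
    for (account, assoc_host), people in approved.items():
        if assoc_host == host:
            wanted[account] = sorted(
                public_keys[person] for person in people if person in public_keys
            )
    return wanted


def remove_person(person, approved, public_keys):
    """Drop a person everywhere; the next client poll then removes their keys."""
    public_keys.pop(person, None)
    for people in approved.values():
        people.discard(person)
```

In this sketch, removing John Doe once is enough for his key to disappear from both JDOE@FOO and operator@BAR the next time each host recomputes its key lists, which mirrors the departure scenario described in the overview.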

In our case, our primary web server at www.star.bnl.gov hosts all the STAR Key Manager (SKM) services (starkeyw and starkeyd via Apache, and a MySQL database), but they could each be on separate servers if desired.

Perhaps a picture will help. See below for a link to an image labelled "SKMS in pictures".

Deployment Status and Future Plans

We have begun using the Key Management system with several nodes and are seeking to add more (currently on a voluntary basis). Only RHEL 3/4/5 and Scientific Linux 3/4/5 with i386 and x86_64 kernels have been tested, but there is no reason to believe that the client couldn't be built on other Linux distributions or even Solaris. We do not anticipate "forcing" this tool onto any detector sub-systems during the 2007 RHIC run, but we do expect it (or something similar) to become mandatory before any future runs. Please contact one of the admins (Wayne Betts, Jerome Lauret or Mike Dephillips) if you'd like to volunteer or have any questions.

User access is currently based on RCF Kerberos authentication, but may be extended to additional authentication methods (e.g., BNL LDAP) if the need arises.

Client RPMs (for some configurations) and SRPMs are available, and some installation details are available here:

http://www.star.bnl.gov/~dmitry/skd_setup/

An additional related project is the possible implementation of a STAR ssh gateway system (while disallowing direct login to any of our nodes online) - in effect acting much like the current ssh gateway systems' role in the SDCC. Though we have an intended gateway node online (stargw1.starp.bnl.gov, with a spare on hand as well), its use is not currently required.

Anxious to get started?

Here you go: https://www.star.bnl.gov/starkeyw/

You can use your RCF username and Kerberos password to enter.

When uploading keys, use your SSH public keys - they need to be in OpenSSH format. If not, please consult SSH Keys and login to the SDCC.
