JES2 not purging output on PURGE queue

Peter Hunkeler

2015-01-06 15:32:34 UTC

I'm about to open a PMR, but thought I'd ask here first in case I'm overlooking something.

This is happening on a z/OS V2.1 JES2 parallel sysplex with 4 systems, two work systems and two GDPS controlling images. Both work systems have been IPLed recently (December 30 and 31).

We need to take spool volume SPL902 offline and have drained it. We're purging output older than 21 days by issuing $POJQ,ALL,DAYS>21 on a daily basis. Today I was surprised that the spool volume was still quite full, even though I can see only a few small outputs in the hold and output queues (SDSF). Then I found that more than 1000 jobs are still on the PURGE queue, awaiting purge. JES2 does not seem to be purging, and I cannot find out why. When I try to purge jobs on the purge queue, some are actually purged and some are not. I cannot find out why some jobs stay on the queue.

I could go on purging the jobs manually, but I would rather understand why purging is stuck. I'd appreciate any hint on what to look at or what to try. Below is the output of some commands (display and purge) I issued, with some comments in between.
--
Peter Hunkeler

Status of the spool volumes as shown by $DSPL,LONG

$HASP893 VOLUME(SPL901)
$HASP893 VOLUME(SPL901) STATUS=ACTIVE,DSNAME=SYS1.HASPACE,
$HASP893 SYSAFF=(ANY),TGNUM=327275,
$HASP893 TGINUSE=18901,TRKPERTGB=3,PERCENT=5,
$HASP893 RESERVED=NO,MAPTARGET=NO
$HASP893 VOLUME(SPL902)
$HASP893 VOLUME(SPL902) STATUS=DRAINING,AWAITING(JOBS),
$HASP893 DSNAME=SYS1.HASPACE,SYSAFF=(ANY),
$HASP893 TGNUM=349335,TGINUSE=65497,TRKPERTGB=3,
$HASP893 PERCENT=18,RESERVED=NO,MAPTARGET=NO
$HASP646 5.7752 PERCENT SPOOL UTILIZATION
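
A quick arithmetic cross-check of these figures, as a minimal Python sketch. The track-group counts are copied from the $HASP893 lines above; the dictionary layout and the helper names `percent_used` and `truncate` are illustrative assumptions. The per-volume PERCENT values agree with TGINUSE divided by TGNUM with the fraction dropped, and the $HASP646 figure agrees with SPL901 alone. That suggests the draining volume is not counted in the overall utilization, but this is an inference from the numbers, not something the messages state.

```python
import math

# Track-group counts copied from the $HASP893 output above.
VOLUMES = {
    "SPL901": {"status": "ACTIVE", "tgnum": 327275, "tginuse": 18901},
    "SPL902": {"status": "DRAINING", "tgnum": 349335, "tginuse": 65497},
}


def percent_used(tginuse: int, tgnum: int) -> float:
    """Return track groups in use as a percentage of all track groups."""
    return 100.0 * tginuse / tgnum


def truncate(value: float, digits: int) -> float:
    """Cut a value off after `digits` decimal places without rounding."""
    factor = 10 ** digits
    return math.floor(value * factor) / factor


# Utilization of each volume on its own.
per_volume = {
    name: percent_used(v["tginuse"], v["tgnum"]) for name, v in VOLUMES.items()
}

# Utilization if both volumes were counted together.
all_volumes = percent_used(
    sum(v["tginuse"] for v in VOLUMES.values()),
    sum(v["tgnum"] for v in VOLUMES.values()),
)

for name, pct in per_volume.items():
    print(f"{name}: {pct:.4f} percent of track groups in use")
print(f"both volumes combined: {all_volumes:.4f} percent")
```

Counting both volumes together gives roughly 12.5 percent, which does not match the 5.7752 reported by $HASP646, whereas SPL901 on its own does.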

The status of the PURGE PCEs is as follows:

$DPCE(PURGE),LONG,DET

$HASP653 PCE(PURGE)
$HASP653 PCE(PURGE) NAME=PURGE,WAIT=PURGE,INHIBIT=NO,
$HASP653 MOD=HASPTRAK,SEQ=17246200,
$HASP653 TIME=(2015.006,13:24:06.264660),
$HASP653 ACTIVE=0,I/O=0,
$HASP653 NAME=PURGE,WAIT=PURGE,INHIBIT=NO,
$HASP653 MOD=HASPTRAK,SEQ=17246200,
$HASP653 TIME=(2015.006,13:24:06.264700),
$HASP653 ACTIVE=0,I/O=0

It seems the PCEs are waiting for work, but none is assigned even though the purge queue is over 1000 jobs long.

Here is the result of a $D command for two jobs on the purge queue. Some jobs show the additional information PURGE=YES,CANCEL=YES, some do not. (What does this tell me?)

$DS(489149)
$HASP890 JOB(BMCDBC)
$HASP890 JOB(BMCDBC) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),HOLD=(NONE)

$DS(496983)
$HASP890 JOB(C2PACMON)
$HASP890 JOB(C2PACMON) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),
$HASP890 HOLD=(NONE),PURGE=YES,CANCEL=YES

A $DS...,LONG does not show anything surprising:

$DS(496983),LONG
$HASP890 JOB(C2PACMON)
$HASP890 JOB(C2PACMON) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),
$HASP890 HOLD=(NONE),PURGE=YES,CANCEL=YES,
$HASP890 CMDAUTH=(LOCAL),OFFS=(),SECLABEL=,
$HASP890 USERID=C2PSUSER,SPOOL=(VOLUMES=(SPL901,
$HASP890 2),TGS=2,PERCENT=0.0002),ARM_ELEMENT=NO,
$HASP890 CARDS=2,REBUILD=NO,CC=(COMPLETED,RC=0),
$HASP890 DELAY=(),CRTIME=(2014.342,12:21:43)

$DS(493612),LONG
$HASP890 JOB(BMCDBC)
$HASP890 JOB(BMCDBC) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),HOLD=(NONE),
$HASP890 CMDAUTH=(LOCAL),OFFS=(),SECLABEL=,
$HASP890 USERID=TECBMC01,SPOOL=(VOLUMES=(SPL902),
$HASP890 TGS=5,PERCENT=0.0007),ARM_ELEMENT=NO,
$HASP890 CARDS=2,REBUILD=NO,CC=(COMPLETED,RC=0),
$HASP890 DELAY=(),CRTIME=(2014.337,11:32:39)

Lizette Koehler

2015-01-06 15:52:18 UTC

Why purge Spool? Why not use the MIGRATION of spool volumes?

I hear it works very well for moving data from one spool volume to another.

Check out $MSPL commands

http://www-03.ibm.com/systems/z/os/zos/features/jes2/

Lizette

Lizette Koehler

2015-01-06 15:54:00 UTC

Also check out

$da,xeq

You may have jobs still running on the spool. They only get removed when
they are no longer active.

Lizette

Peter Hunkeler

2015-01-06 17:01:12 UTC

Post by Lizette Koehler
Why purge Spool? Why not use the MIGRATION of spool volumes?

There was no hurry, so I just drained the volume to be emptied. The daily $POJQ... for output older than 21 days would empty the volume over time.

Post by Lizette Koehler
You may have jobs still running on the spool. They only get removed when they are no longer active.

True, but I drained the volume two weeks ago, and the systems were IPLed at the end of December. Therefore no executing jobs should have spool space allocations on the drained volume.

Anyway, I would expect JES2 to get rid of output that is on the PURGE queue (oh, BTW, there are also jobs on the purge queue which have spool space allocations on volumes other than the drained one).

--
Peter Hunkeler

Tom Wasik

2015-01-06 17:59:00 UTC

I noticed that the jobs are in independent mode ... SYSAFF=(IND,ANY). Is your member set to independent mode (i.e., does $D MEMBER show IND=YES)? If not, this is likely your problem. Try a $TJ command to remove independent mode from the job (SYSAFF=(-IND)) and see if the job purges. Jobs that are marked independent can only be processed on members that are in independent mode.

You could change the member to independent mode but then any work that enters the system while it is in independent mode will be marked as independent mode (creating more problem jobs).

Tom Wasik
JES2 Development
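
The check described above (spotting jobs whose SYSAFF list contains IND) can be sketched as a small Python filter over captured $HASP890 display output. This is only an illustration under stated assumptions: the function name `jobs_in_independent_mode` is hypothetical, the input is assumed to be console text saved in the same layout as the displays earlier in this thread, and the parsing relies on each job starting with a line that holds nothing but `JOB(name)`.

```python
import re


def jobs_in_independent_mode(display_output: str) -> list[str]:
    """Return names of jobs whose SYSAFF list contains IND.

    Assumes every message line starts with $HASP890 and that each job
    begins with a header line containing only JOB(name).
    """
    records = []  # each entry: [job name, concatenated keyword text]
    for raw in display_output.splitlines():
        line = raw.strip()
        if not line.startswith("$HASP890"):
            continue  # ignore commands and unrelated messages
        body = line[len("$HASP890"):].strip()
        header = re.fullmatch(r"JOB\(([^)]+)\)", body)
        if header:
            records.append([header.group(1), ""])
        elif records:
            # Continuation line: join without a separator so keywords
            # split across lines are put back together.
            records[-1][1] += body

    flagged = []
    for name, text in records:
        match = re.search(r"SYSAFF=\(([^)]*)\)", text)
        if match and "IND" in match.group(1).split(","):
            flagged.append(name)
    return flagged


# Sample text copied from the displays earlier in this thread.
SAMPLE = """\
$HASP890 JOB(BMCDBC)
$HASP890 JOB(BMCDBC) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),HOLD=(NONE)
$HASP890 JOB(C2PACMON)
$HASP890 JOB(C2PACMON) STATUS=(AWAITING PURGE),CLASS=STC,
$HASP890 PRIORITY=1,SYSAFF=(IND,ANY),
$HASP890 HOLD=(NONE),PURGE=YES,CANCEL=YES
"""

print(jobs_in_independent_mode(SAMPLE))
```

Both sample jobs are flagged because their SYSAFF value is (IND,ANY). A job displayed with SYSAFF=(ANY) would not be.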

Peter Hunkeler

2015-01-06 18:12:39 UTC

Post by Tom Wasik
I noticed that the jobs are in independent mode ... SYSAFF=(IND,ANY). Is your member set to independent mode (i.e., does $D MEMBER show IND=YES)? If not, this is likely your problem. Try a $TJ command to remove independent mode from the job (SYSAFF=(-IND)) and see if the job purges. Jobs that are marked independent can only be processed on members that are in independent mode.

I saw the independent mode indicator and thought I should check whether it might be the clue I was looking for. Obviously my aging brain forgot to remind me ;-)

I'll check tomorrow and will post the result. Thanks for the hint.

--
Peter Hunkeler

Peter Hunkeler

2015-01-07 07:42:59 UTC

Post by Tom Wasik
I noticed that the jobs are in independent mode ... SYSAFF=(IND,ANY). Is your member set to independent mode (i.e., does $D MEMBER show IND=YES)? If not, this is likely your problem.

No member currently is in independent mode. I did a $TJOBQ,Q=PURGE,SYSAFF=-IND and surprise, surprise, the output is being purged. Thanks Tom.

I'll have to find out why those jobs were assigned SYSAFF=(IND,ANY). Since the dates are widely spread, it must happen every now and then... Something is wrong in our setup.

--

Peter Hunkeler