For SpaceNut re #1625
It will be interesting to see if you can set in motion a procedure to recover lost posts.
However, that is something for next year. WBA is ** really ** bogging down as it reaches back into the archive.
It is working the range from 2001 to 3000, and the average time per post is up to over a minute. I can see the Smart Wait counting as the FluxBB server digs around in the database for another ancient post.
The Smart Wait limit is currently set at 60 seconds, but we are quite some distance from ** that ** value.
OK re staying on course ... I'm working on code to give you a target list of ** just ** Admin posts that need work.
(th)
Offline
For SpaceNut re Post Repair initiative ...
WBA completed a scan from 2001 to 3000 .... the average time per lookup reached 68 seconds. That's amazing.
However, despite the obvious struggle on the part of the FluxBB server, it withstood the burden and finished the job.
Total Command Lines found: 23
Total input Lines in script: 75
Number of ID's processed: 1000
Starting Number: 2001
Last Number of Run: 3000
Summary for Web Automation Report for 09-27-2021 at 09:07:09
Average time of Loop from Main form: 00:01:08
Total time of Processing: 19:01:16
Total time Program was Active: 23:42:13
EOJ Msg#1: Successful Run. No Warning Errors.
The scan of 1000 posts took just over 19 hours.
I'll start the scan from 3001 immediately after saving the output of this run.
(th)
Offline
For SpaceNut ....
Please ask Mr. Burk to rebuild the forum database indexes.
The Post ID ** is ** an index.
It is now taking 71 seconds to perform a cycle. Normal would be 26 seconds.
That should most definitely **not** be happening.
All lookups for a primary key should be nearly instantaneous.
A possible explanation is that the indexes are fragmented and possibly distributed over multiple hardware devices.
The run from 3001-4000 started out at 71 seconds, and it is likely to grow throughout the day.
At 71 seconds, 1000 lookups would take 20 hours, and if the time required continues to grow the job may take more than a day.
A consideration is that the data itself may be the cause of the delay.
It may be resident in archival storage, since it is on the order of 20 years old.
That could be a reason for the long retrieval time.
The index may be fine, but the data would necessarily reside in another location.
In any case, this job is going to take longer than anticipated if there is no way to improve system performance.
In case you pass this along to Mr. Burk, here is the command being executed:
Link to Post: http://newmars.com/forums/edit.php?id=#### (eg, 3001)
(th)
Offline
The index and post count are off, and during that time the posts, while under the same topic name, were not in the same folders as they are now.
Offline
For SpaceNut re #1629
The time required to perform a lookup is increasing slowly but steadily.
Your mention of index and post counts being off is not reassuring.
Are you planning to ask Mr. Burk to take a look at the situation?
Should I stop these runs, or continue them even though they are slowing down?
The laptop doesn't care, and apparently the server for Mars Society is OK with this activity.
It doesn't balk. It is just slowing down.
The time required to fit 1000 lookups into 24 hours is 1 minute and 26 seconds. The time was 1 minute and 17 seconds the last time I looked.
If you are not familiar with databases, this behavior is NOT right! A call on a primary index should be practically instantaneous, for any record in the index.
If the data I'm requesting is on archival (ie, slow) storage, that might explain the slow response, but the change of time is what is concerning to me.
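The arithmetic behind the 1 minute 26 second figure can be sketched as follows. This is only an illustration: the function names are made up for this post and are not part of WBA.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_lookup_budget(lookups: int, window_seconds: int = SECONDS_PER_DAY) -> float:
    """Longest average cycle time (seconds) that still fits `lookups` into the window."""
    return window_seconds / lookups


def run_hours(lookups: int, seconds_per_lookup: float) -> float:
    """Total run time in hours, assuming the cycle time stays constant."""
    return lookups * seconds_per_lookup / 3600


def as_min_sec(seconds: float) -> str:
    """Format seconds as m:ss, dropping any fraction of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


print(as_min_sec(per_lookup_budget(1000)))  # 1:26
print(round(run_hours(1000, 77), 1))        # 21.4 (hours at 1 minute 17 seconds per lookup)
```

At the 1:17 last observed, a run of 1000 still fits in a day, but the margin shrinks as the cycle time grows.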
(th)
Offline
For SpaceNut ... the scan of Posts from 3001-4000 just ended ... Average time ended up at 1 minute 31 seconds.
That means a run will take more than 24 hours. A run would take exactly 24 hours if the average were 1:26.
Completed Sequence for ID: 4000
Total Command Lines found: 23
Total input Lines in script: 75
Number of ID's processed: 1000
Starting Number: 3001
Last Number of Run: 4000
Summary for Web Automation Report for 09-28-2021 at 10:52:48
Average time of Loop from Main form: 00:01:31
Total time of Processing: 25:17:43
Total time Program was Active: 25:36:53
EOJ Msg#1: Successful Run. No Warning Errors.
Just as an experiment, I'm planning to set up the next test at 80000, to see if the time required for lookups is greater there.
Update at 12:14 local time ...
The run at 81000 started at 1:27, and increased to 1:37 after a few cycles.
The time required is greater than we saw at 4000, but NOT dramatically worse.
There may be something else going on .... FluxBB may be logging all traffic on the server.
If ** that ** is the case, and if it takes more and more time to perform logging, then the increase in time will NOT depend upon the part of the index in work, but instead on the total of all the accesses required to cover the range from 1 to 81000.
One thing I was glad to see was that no bandits appeared in the first reports from the run at 81000.
That would imply SpaceNut's estimate of where the upper limit of bandit population might be is close to the mark.
(th)
Offline
For SpaceNut .... re Post Scan for bandits...
The problem of individual scans taking too long seems to have solved itself ...
The test run at 80001 halted at 80234 with a time-out.
Completed Sequence for ID: 80233
There are admin posts.
No post_uid incidents were found
Completed Sequence for ID: 80233
Starting Sequence for ID: 80234
Total Command Lines found: 23
Total input Lines in script: 75
Number of ID's processed: 234
Starting Number: 80000
Last Number of Run: 81000
Summary for Web Automation Report for 09-28-2021 at 19:59:50
Average time of Loop from Main form: 00:01:42
Total time of Processing: 06:38:32
Rather than continue scanning, I'll concentrate on developing code to deal with the new challenge.
I've got plenty of data to work with.
Update at 20:56 local time ...
I studied the script, removed an unneeded information-only section, and the time improved.
The post lookup seems to be holding at 23/24 seconds.
I'll try a 1000 run tomorrow, and see how it goes.
(th)
Offline
For SpaceNut re Posts Repair initiative ..
A test run of an adjusted script is in progress. The adjustment was to remove a step that I had added to try to offer a break between Post requests.
The additional step appears to have done (a lot) more harm than good. The revised script is showing 55 seconds per cycle after 68 cycles. If it sustains that performance it will complete a run to 5000 in about 16 hours.
If we can sustain a pace of 1000 Posts scanned per day, we can complete this process by the end of the year.
Update at 12:37 local time: WBA just crossed 200 post lookups ... cycle time increased from 57 seconds to 59.
That would be an increase of 1 second per 100 cycles. If the FluxBB server can maintain that pace, we can reach the goal of 1000 posts in 24 hours.
The total (if the current rate of increase holds) would be 57+10 or 67 seconds, which is well under the 1:26 for a full 24 hours.
Meanwhile, I am starting to catch up on preliminary analysis of the data captured to date.
Output of the preliminary analysis will be: (1) Range of posts (2) Number of Admins found (3) Number of bandits found
We have established the upper bound for the project. It appears that 80,000 is beyond the end of the region of interest.
If we can hold to a pace of 1000 post lookups per day, we can finish the scan before the end of 2021.
Processing the actual posts is a separate activity. The current runs are NOT performing updates. They are collecting data.
Update at 13:53 local time....
The run from 4001-5000 is continuing to hold at a modest rate of slowing ...
With 269 posts scanned, cycle time is up to 60 from a start at 55 seconds.
If FluxBB can maintain that rate of increase, at the end of 1000 scans the total increase would be 18.6 seconds, for a cycle time of about 73.6 seconds.
That would be just ** under ** the 24 hour target of 1 minute 26 seconds.
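The projection above is a straight-line extrapolation. A minimal sketch, assuming the slowdown stays linear for the rest of the run (the function name is hypothetical, not something in WBA):

```python
def projected_cycle_time(start_s: float, current_s: float,
                         cycles_done: int, cycles_total: int) -> float:
    """Extrapolate the cycle time at the end of a run, assuming linear growth."""
    growth_per_cycle = (current_s - start_s) / cycles_done
    return start_s + growth_per_cycle * cycles_total


# Figures from this update: 55 s at the start, 60 s after 269 lookups, 1000 planned.
final = projected_cycle_time(55, 60, 269, 1000)
print(round(final - 55, 1))  # 18.6 seconds of added delay
print(round(final, 1))       # 73.6 seconds per cycle at the end
print(final < 86.4)          # True: still under the 24 hour limit
```

If the growth turns out to be faster than linear, this estimate will be too optimistic.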
(th)
Offline
For kbd512,
Glad to see the Charger is back home!
I hope it holds up for a number of years, or as long as you need it, whichever comes first.
***
Can I interest you in the TestID updates initiative?
Your demonstrations of the (to me impressive) software you found suggest you can polish off the remaining banned TestID's fairly quickly.
There will be no more added to the pool. That project has come to an end, with a grand total of 18,344 TestID's recovered from our Russian and other global spammer friends.
You left off at 10,900 (as I recall) ...
(th)
Offline
For SpaceNut .... preview of coming attractions ....
No one else could possibly be interested in this post, and I am not even sure of you !!!
****
Inventory of Todo Items NewMars Posts Repair Initiative
2021/09/29 4001-5000 Admins
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 101-1000 Admins 103 Bandits 114
2021/09/21 001-100 Admins 0 post_uid 21
The summary above shows very precisely the workload ahead for the early posts in the Posts Repair initiative.
I'm planning to write code and script to look at the Admin posts to find the ones with bandits.
This script would run in guest mode, so blocks will not occur. The only unknown at this point is whether a guest can see the bandits.
I'm assuming that will be the case, because (I'm hoping) FluxBB will treat the bandits as text and show them to the public.
The posts that contain bandits will (in the fullness of time) be fed into an update procedure that is still just a sketch on a paper napkin.
All I've accomplished on ** that ** front is to firm up the parameter passing procedure for the situation at hand.
The Script command will pass: (a) the text to be deleted and (b) the character at which deletion will stop (eg, ])
However, only the Help text has been updated at this point.
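For illustration only, here is one way the two-parameter deletion could behave. This is a sketch, not the WBA implementation (which does not exist yet). It assumes the stop character itself is kept, and the sample tag format is a guess based on the ":post_uid" artifacts reported in this topic.

```python
def strip_artifact(text: str, marker: str, stop_char: str) -> str:
    """Delete every run that starts at `marker` and ends just before `stop_char`.

    Assumption: the stop character is preserved. If no stop character follows
    a marker, the rest of the text is left untouched rather than deleted.
    """
    pieces = []
    pos = 0
    while True:
        start = text.find(marker, pos)
        if start == -1:
            pieces.append(text[pos:])      # no more markers: keep the tail
            break
        pieces.append(text[pos:start])     # keep everything before the marker
        stop = text.find(stop_char, start + len(marker))
        if stop == -1:
            pieces.append(text[start:])    # unterminated marker: leave as-is
            break
        pos = stop                         # resume at the stop character (kept)
    return "".join(pieces)


sample = "[quote:post_uid3]Hello[/quote:post_uid3]"   # hypothetical post body
print(strip_artifact(sample, ":post_uid", "]"))       # [quote]Hello[/quote]
```

Leaving unterminated markers alone is the cautious choice here, since a wrong deletion in a 20 year old post would be hard to undo.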
Update at 19:27 local time ....
I am increasingly encouraged .... at a count of 558 cycles out of a planned 1000, the cycle time has increased from 55 seconds to 63 seconds.
That is less than two seconds increase per 100 cycles. If the current rate of 1.44 seconds increase per 100 cycles holds, the final value will be: 69.6.
That is well under the value for 1000 cycles in 24 hours, of 86.4 seconds.
(th)
Offline
So in the edit id of posts 1-100 you indicate there were 21 incidents of artifacts present...
Did those get fixed?
Edit: I checked post id 18 and removed the 4 incidents that it contained (:post_uid3).
Offline
For SpaceNut re #1636
You asked if I had fixed any artifacts....
I ** did ** manually adjust a few posts, so I could understand what the program needs to do.
At the time, I reported my work. You double checked the work and reported correcting one, but you did not tell me what you fixed, so I cannot tell the program what to do.
The program is not written for this new activity. Only the Help button has been updated to show what the new script command will look like.
For that reason ** none ** of the artifacts have been corrected by the program.
What turns out to be a higher priority is what I am doing ....
The read-only script is accumulating (a) a list of Admin posts and (b) a list of posts that contain artifacts.
I am planning to write a new function for the Extract program that takes in the output of the script, and delivers a list of posts that need work. The new function will deliver two lists:
1) A list of posts by admins
2) A list of posts by members that contain artifacts
The first list will be fed into another script (to be written) that will identify Admin posts that contain artifacts.
I am hoping that if I run the script ** WITHOUT ** being logged in, FluxBB will show the artifacts if there are any.
The output of ** that ** script would be fed into an extract that ** should ** yield a list of posts by Admins that contain artifacts.
The second list will be fed into a new script that will employ a new function (not yet written) to remove artifacts from member posts.
The scan that is running now is a read-only script that will deliver (a) a list of posts by Admins and (b) a list of posts that contain artifacts.
I am collecting the output of the scan runs, and will eventually process them to yield the outputs described above.
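The two-list split described above might look something like this. It is a sketch under assumptions: the Extract function has not been written, and the ScanRecord fields are hypothetical stand-ins for whatever the scan log actually records per post.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScanRecord:
    """One scanned post (hypothetical shape, not the real WBA log format)."""
    post_id: int
    by_admin: bool
    has_artifact: bool  # only meaningful for member posts in this sketch


def build_work_lists(records: Iterable[ScanRecord]) -> Tuple[List[int], List[int]]:
    """Return (admin post IDs, member post IDs that contain artifacts)."""
    admin_posts = []
    member_posts_with_artifacts = []
    for record in records:
        if record.by_admin:
            # Admin posts go to the second-pass script regardless of has_artifact.
            admin_posts.append(record.post_id)
        elif record.has_artifact:
            member_posts_with_artifacts.append(record.post_id)
    return admin_posts, member_posts_with_artifacts


scan = [
    ScanRecord(4001, by_admin=False, has_artifact=True),
    ScanRecord(4002, by_admin=True, has_artifact=False),
    ScanRecord(4003, by_admin=False, has_artifact=False),
]
admins, members = build_work_lists(scan)
print(admins)   # [4002]
print(members)  # [4001]
```

Member posts without artifacts drop out of both lists, which is what keeps the eventual update workload small.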
Running the scans will take about three months, if the current script holds up against the slow increase in run time that I have reported to you previously.
Somewhere along the line, as time permits, I'll be working on the new functions and new scripts.
However, since a scan script for 1000 posts is taking almost a full day, I won't be doing any updates for a while, except for testing.
Meanwhile, kbd512 has his Charger back, so perhaps he might be willing to run his TestID update program.
He left off at 10,900 (as I recall) .... the grand total of available TestID's is 18,344 (again, as I recall).
(th)
Offline
For SpaceNut re Posts Repair Initiative...
Yesterday's run from 4001 to 5000 completed successfully in 18 hours.
The increase in time per cycle occurred, but the increase remained slow enough so that the desired goal was achieved within 24 hours.
With that success in hand, I am planning a campaign of 75 Earth days, to complete review of the entire set of posts through 80,000.
I'll also be working on new code and new scripts as described in Post #1637.
Completion of the entire project within 2021 looks feasible.
Completed Sequence for ID: 5000
Total Command Lines found: 18
Total input Lines in script: 76
Number of ID's processed: 1000
Starting Number: 4001
Last Number of Run: 5000
Summary for Web Automation Report for 09-30-2021 at 07:37:59
Average time of Loop from Main form: 00:01:06
Total time of Processing: 18:23:56
Total time Program was Active: 22:12:11
For SpaceNut ... here is the updated Inventory of work items:
Inventory of Todo Items NewMars Posts Repair Initiative
2021/09/29 4001-5000 Admins 53 Bandits 509
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 101-1000 Admins 103 Bandits 114
2021/09/21 001-100 Admins 0 post_uid 21
Here's a curiosity ... I don't know what to make of it .... In the current batch, 5001-6000, a post appeared that had the quote format without a bandit.
Subsequently a post showed up with bandits as usual.
It is interesting that in the days when bandits were being created, there was at least one instance when the bandits did not appear.
It's probably just an academic observation ... the run from 5001-6000 is going to include some bandits.
The average cycle time is holding at .... surprise! The average declined from 1:05 to 1:04 .... I wonder if FluxBB is making an accommodation?
Perhaps it was just cranky about being asked to look at 20 year old posts, and has gotten used to the idea, so it is now grudgingly performing, albeit slowly.
Update at 19:53 local time ... WBA is past midway through tonight's run ... The average cycle time is up to 1:13, which implies it will finish before sunrise here if it maintains the current pace of slowing.
** Very ** nice to see Calliban back, and with so many lengthy posts on so many topics.
(th)
Offline
Not sure which topic it is, but I found that post id 5006 did have the quote error, and the topic it's in has lots of the errors: http://newmars.com/forums/viewtopic.php?id=2666
Offline
For SpaceNut re #1639
Thanks for the heads-up! WBA is currently scanning from 5001 through 6000. It will finish the run sometime tomorrow morning. My procedure is to save the file to a USB stick and start the next run.
I take the USB stick to another system where I run an Extract to produce the result I've posted to show current status.
#5006 you found should be included in tomorrow's report.
If you want to get started on fixing the Admins, you can ask me to hurry up and give you the list of Admins with Bandits.
***
Reminder ... kbd512 left off updating the TestID's at 10,900. The complete set runs through 18,344.
Update next day at 6:28 local time ...
WBA completed a scan to 6000 ... Cycle time reached 1:16, which is well under 1:26 (24 hour total time).
Completed Sequence for ID: 6000
Total Command Lines found: 18
Total input Lines in script: 76
Number of ID's processed: 1000
Starting Number: 5001
Last Number of Run: 6000
Summary for Web Automation Report for 10-01-2021 at 06:29:58
Average time of Loop from Main form: 00:01:16
Total time of Processing: 21:15:16
Total time Program was Active: 21:46:40
Results of the scan will be posted later today.
(th)
Offline
For SpaceNut re Posts Repair initiative....
A scan from 6001-7000 is underway ... here are the results from the run from 5001-6000:
Inventory of Todo Items NewMars Posts Repair Initiative
2021/09/30 5001-6000 Admins 78 Bandits 595
2021/09/29 4001-5000 Admins 53 Bandits 509
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 0101-1000 Admins 103 Bandits 114
2021/09/21 0001-0100 Admins 0 post_uid 21
Update at 17:28 local time ...
OK SpaceNut ... it appears we'll just have to get used to slow lookups from FluxBB when we are going back 20 years or so.
Data in that range may well be archived on slow media, or quite possibly even compressed for long term storage.
I launched today's run from 6001-7000 this morning (local time) and found it had run out of gas after only 200 cycles.
It is back running again, and it might finish some time tomorrow, or it might not.
The specific behavior that I'm calling to your attention does not necessarily require any action on your part, since the scan seems to be working normally after a halt. Nevertheless, it is (probably?) useful for you to know what I am seeing, in case someone asks about it in future.
A consideration is that "normal" customer (user) lookup on NewMars is by topic, so it is entirely possible the design of the system is optimized for topic-oriented lookup.
Since I am doing something that would not be done by an ordinary member, my report may not be applicable to them.
That said, what I am seeing is a variable time delay from (on the order of) 30 seconds all the way up to 60 seconds, for a lookup by Post ID.
The WBA program is set to give up after 60 seconds, and I think that provision should be left in place.
We are seeing variable performance ... individual lookups can be less than the mean or greater.
The end-of-job report only shows the ** average ** of lookup times, and not the extremes.
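If the per-lookup times were logged, the extremes could be reported alongside the average. A minimal sketch, assuming a plain list of durations in seconds (the sample values are invented for illustration and are not measurements):

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize(durations_s: Sequence[float]) -> Tuple[float, float, float]:
    """Return (fastest, average, slowest) lookup time in seconds."""
    if not durations_s:
        raise ValueError("no lookups recorded")
    return min(durations_s), mean(durations_s), max(durations_s)


# Hypothetical sample, for illustration only.
fastest, average, slowest = summarize([31.0, 58.5, 42.0, 60.0, 35.5])
print(fastest, slowest)  # 31.0 60.0
```

A slowest value sitting at the timeout limit would show how close a run came to halting, which the average alone hides.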
It would appear to ** NOT ** be guaranteed we can complete the scan of the region from 1 to 80,000 this year, but (on the other hand) it would appear to be possible. In any case, whether an individual run takes a day or more than a day, you'll know with precision exactly how many bandits there are, and how many Admin posts exist within a particular sequence.
Development of code/scripts to refine the data is still pending, and will occur no sooner than next week.
***
It is fun seeing the effect of Calliban's return!
***
Update at 19:27 local time ... The new ETA for WBA's run to 7000 is 10.3 tomorrow. The actual ETA is likely to be later.
Average cycle time is showing as 1:24 ... that number is ** just ** under the 24 hour limit of 1:26
***
For SpaceNut ... if you want to test your (renowned and supreme) search skills, please see if you can find the latest post that contains a bandit.
We know it is somewhere less than 80,000, but the exact post number after which the (artifacts as you call them) are not found is (to me at least) unknown.
(th)
Offline
id= 6002 for topic http://newmars.com/forums/viewtopic.php?id=2746 started back in 2002
Offline
For SpaceNut re #1642 ... thanks for the interesting link to Topic 2746
The scan run to 7000 completed a few minutes ago local time.
It was broken up into three separate sections, so I'll combine the logs into one file to cover 6001-7000
Completed Sequence for ID: 7000
Total Command Lines found: 18
Total input Lines in script: 76
Number of ID's processed: 493
Starting Number: 6508
Last Number of Run: 7000
Summary for Web Automation Report for 10-02-2021 at 10:26:16
Average time of Loop from Main form: 00:01:28
Total time of Processing: 12:06:10
Total time Program was Active: 12:14:11
I'll post the statistics as soon as they are ready.
Inventory of Todo Items NewMars Posts Repair Initiative
2021/10/01 6001-7000 Admins 75 Bandits 578
2021/09/30 5001-6000 Admins 78 Bandits 595
2021/09/29 4001-5000 Admins 53 Bandits 509
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 0101-1000 Admins 103 Bandits 114
2021/09/21 0001-0100 Admins 0 post_uid 21
(th)
Offline
Total number of topics: 7,729 but id is at 10043 for the next new topic to take.
a no topic gives
Info
Bad request. The link you followed is incorrect or outdated.
restored after great crash id is 6023 for 2008
at time of crash 5063 is last topic for 2006
4143 through 4844 are another group in 2006
That means we have another group of topics that is MIA.
Offline
For SpaceNut re #1644 ....
The number of topics you reported is, quite simply, ** amazing **.
It is true that some members create new topics without checking earlier ones, but on the other hand, I have noticed that oftentimes, existing topics don't match the new content to be posted.
***
WBA just timed out after 60 seconds, waiting for a response for Post #7131
I had noticed the timeout counter reached 59 a few minutes ago, so I made an adjustment to WBA to permit replacement of the default value of 60 seconds with whatever the operator thinks might make sense.
I'll replace WBA and restart the run to 8000 from 7131.
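The adjustment amounts to making the wait limit a parameter instead of a fixed 60 seconds. A minimal sketch of that idea, not the actual WBA code: the function name, the polling approach, and the `is_ready` callback are all assumptions made for this example.

```python
import time
from typing import Callable


def smart_wait(is_ready: Callable[[], bool],
               timeout_s: float = 60.0,
               poll_s: float = 1.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `is_ready` until it returns True or `timeout_s` seconds pass.

    Returns True if the page became ready, False if the wait timed out.
    `clock` and `sleep` are injectable so the logic can be checked without
    actually waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# The operator supplies a limit in place of the 60 second default.
print(smart_wait(lambda: True, timeout_s=90.0))  # True (ready on the first check)
```

Keeping 60 seconds as the default preserves the current behavior whenever the operator does not supply a value.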
Update at 17:08 local time ... For SpaceNut .... It finally came to me that the browser may be the cause of the delays I've been seeing.
I put the program into step mode, and observed delays where there should not have been any.
I've been running the browser full tilt for months. It saves a lot of data with each action, and the delay I am seeing may be caused by attempts to save new data. I left the laptop running a History Purge. It was cranking on the "Browser Autofill" option.
There were 58,000+ items reported as saved as Browser history.
192,864,256 bytes were removed. That is all data collected by the browser since it was installed. Most would have accumulated doing the NewMars runs, since the laptop wasn't used online previously.
I'll report on performance after the purge.
Update at 20:12 local time ... The cache was apparently the reason for the delay. A cycle is now down to 23 seconds and holding, and that is with ** addition ** of a step to reset the address bar between post lookups. In my defense, I've never encountered this particular situation before, but then, I've never tried to process 80,000 lookups before either.
(th)
Offline
For SpaceNut re Posts Repair Initiative...
Clearing the cache seems to have corrected the delay problem reported earlier.
Completed Sequence for ID: 8000
Total Command Lines found: 25
Total input Lines in script: 87
Number of ID's processed: 870
Starting Number: 7131
Last Number of Run: 8000
Summary for Web Automation Report for 10-03-2021 at 07:37:42
Average time of Loop from Main form: 00:00:25
Total time of Processing: 06:08:37
I will now clear the cache before starting a daily scan of 1000 posts.
The results of the scan of 7001-8000 will be posted here when they are ready.
Today's run from 8001-9000 will start shortly.
Update at 10:12 local time ... It would appear that 3000 posts can be scanned per day, if the cache is cleared before each batch of 1000.
Inventory of Todo Items NewMars Posts Repair Initiative
2021/10/02 7001-8000 Admins 130 Bandits 571
2021/10/01 6001-7000 Admins 75 Bandits 578
2021/09/30 5001-6000 Admins 78 Bandits 595
2021/09/29 4001-5000 Admins 53 Bandits 509
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 0101-1000 Admins 103 Bandits 114
2021/09/21 0001-0100 Admins 000 post_uid 21
The Admins were (apparently) more active in the series from 7001, but the members held steady.
(th)
Offline
For SpaceNut re Posts Repair Initiative
The scan from 8001-9000 just ended...
Inventory of Todo Items NewMars Posts Repair Initiative
2021/10/03 8001-9000 Admins 57 Bandits 434
2021/10/02 7001-8000 Admins 130 Bandits 571
2021/10/01 6001-7000 Admins 75 Bandits 578
2021/09/30 5001-6000 Admins 78 Bandits 595
2021/09/29 4001-5000 Admins 53 Bandits 509
2021/09/27 3001-4000 Admins 29 Bandits 858
2021/09/26 2001-3000 Admins 15 Bandits 622
2021/09/25 1001-2000 Admins 79 Bandits 388
2021/09/22 0101-1000 Admins 103 Bandits 114
2021/09/21 0001-0100 Admins 000 post_uid 21
A run from 9001-10000 will start shortly.
(th)
Offline
For SpaceNut ... I'm putting this in Housekeeping because chances are you are the only person who will see it, and it is local for you ...
https://www.yahoo.com/news/protesters-r … 00929.html
Protesters return to Merrimack Station, demand closure
Jon Phelps, The New Hampshire Union Leader, Manchester
Sun, October 3, 2021, 6:47 PM
Oct. 3—Six months ago, Sharon Boschert said she moved to New Hampshire from northern California because of forest fires."I couldn't breathe," she said. "The fires were so bad and I was hundreds of miles away from the fires."
On Sunday afternoon, she held a sign that read "Clean Energy Now" and joined more than 100 others in a protest calling for the Merrimack Station coal-burning power plant to be shut down. The plant is operated by Granite Shore Power.
Unlike protests in years past, no participants were arrested. The group is part of the ongoing No Coal, No Gas campaign, a grassroots coalition organized to stop the burning of coal and other fossil fuels for electrical generation in New England.
(th)
Offline
Wow, 6 months and only now in the news... I could have said as much just by looking at all of the out-of-state license plates. Of course, the wildfires are not the only reason for the migration...
I mentioned how thick it was a few weeks ago, as the smoke hung in the air and made for poor breathing quality.
Offline
For SpaceNut re #1649
The report was about the antiquated practice of burning coal to make power in New Hampshire.
The report was dated yesterday. The protester moved to New Hampshire six months ago and discovered the air was thick with coal smoke.
According to Wikipedia, New Hampshire has a nuclear plant: Seabrook Station Nuclear Power Plant.
Construction began in 1976. Almost 50 years. Is it still operating? Are there plans to close it?
Why not build more to replace the coal plants?
(th)
Offline