For SpaceNut re Email Outreach campaign ....
A curiosity of the FluxBB email subsystem is that it provides no feedback on whether it worked or not.
I stopped in at FluxBB.org to see if anyone had ever discussed the mail subsystem, and I found one topic.
That topic included a link to a github repository, and there I found this:
$mail_to = $recipientname." <".$recipientemail.">";
$mail_subject = pun_htmlspecialchars($_POST['message_subject']);
$mail_message = pun_htmlspecialchars($_POST['message_body']);
pun_mail($mail_to, $mail_subject, $mail_message);
From this I deduce that we could see the pun_mail code if we re-connect to the forum source code.
My guess is that the procedure .... I stopped because I have ** no ** idea what it's doing .... if it launches the resident email program, that could be anything.
If it has its own email program, that would have to work within the operating system.
Interesting!
(th)
Offline
For SpaceNut re Pothole Scan ...
Completed Sequence for ID: 83000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 1000
Starting Number: 82001
Last Number of Run: 83000
Summary for Web Automation Report for 08-24-2022 at 07:49:00
Average time of Loop from Main form: 00:00:20
2 Skip Exceptions were recorded.
Total time of Processing: 11:28:17
Total time Program was Active: 11:31:48
Potholes: 82143 82018
Edits: Zero
This evening's scan will try for 84000
(th)
Offline
I have not received any other emails via the forum's emailer as of today, though I did receive the previous one. As you indicated, the timeout might have caused the send not to be performed.
Offline
For SpaceNut re 2453
Thanks for your report on NOT receiving confirmation emails.
I am ** definitely ** getting the impression the email service is unreliable. In the days ahead, my plan is to set up a run as follows:
1) My test account
2) The candidate
3) SpaceNut
4) My test account
It is possible that on a given day, only the first email goes out.
If that is the case, then I'll change the sequence so the candidate is first, and the confirmation emails won't matter.
I ** know ** that multiple emails can work, because RobertDyck complained about receiving one.
I'll have to go back to see if his username was first in the list on the day it was sent.
As another experiment, I could stack 10 transmissions to my test account.
It just crossed my mind that I could add the time of day of the transmission, in order to find out which messages got sent, assuming ** any ** get sent!
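The time-of-day idea could be sketched as below. This is only an illustration in Python, not the WBA script itself; the function name, the subject text and the bracketed format are assumptions made for the example.

```python
from datetime import datetime

def stamped_subject(base_subject, sequence_number, now=None):
    """Append a sequence number and the time of day to an email subject."""
    now = now or datetime.now()
    return f"{base_subject} [#{sequence_number} sent {now:%H:%M:%S}]"

# Hypothetical recipient order, following the plan listed above
recipients = ["test account", "candidate", "SpaceNut", "test account"]
for number, who in enumerate(recipients, start=1):
    print(who, "->", stamped_subject("Email Outreach", number))
```

If only some of the messages arrive, the sequence number and time in each subject line show which transmissions actually went out.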
(th)
Offline
For SpaceNut re Potholes report...
Three potholes were reported:
83340 83213 83203
Tomorrow's run will try for 85000 - goal is 199396 so should finish in 2022
***
Good to see Mars_B4_Moon back after a break!
(th)
Offline
For SpaceNut re Potholes report...
Completed Sequence for ID: 85000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 1000
Starting Number: 84001
Last Number of Run: 85000
Summary for Web Automation Report for 08-26-2022 at 07:53:23
Average time of Loop from Main form: 00:00:20
6 Skip Exceptions were recorded.
Total time of Processing: 12:41:53
Total time Program was Active: 12:45:15
Potholes found: (6) 84352 84207 84157 84107 84106 84052
Tomorrow's Pothole scan will try for 86000 - today's high water mark is 199510
Update at 10:26 local time ... both confirmation emails arrived this morning. You should have received one.
(th)
Offline
For SpaceNut re Pothole Scan...
WBA reached 86000 ...
40 skips were recorded ... analysis to follow
SK: 85084 Bad request
SK: 85103 Bad request
SK: 85153 Bad request
SK: 85202 Bad request
SK: 85203 Bad request
SK: 85240 Bad request
SK: 85247 Bad request
SK: 85248 Bad request
SK: 85249 Bad request
SK: 85251 Bad request
SK: 85252 Bad request
SK: 85254 Bad request
SK: 85257 Bad request
SK: 85259 Bad request
SK: 85260 Bad request
SK: 85261 Bad request
SK: 85264 Bad request
SK: 85265 Bad request
SK: 85272 Bad request
SK: 85274 Bad request
SK: 85289 Bad request
SK: 85298 Bad request
SK: 85300 Bad request
SK: 85379 Bad request
SK: 85380 Bad request
SK: 85406 Bad request
SK: 85411 Bad request
SK: 85412 Bad request
SK: 85413 Bad request
SK: 85414 Bad request
SK: 85465 Bad request
SK: 85466 Bad request
SK: 85467 Bad request
SK: 85469 Bad request
SK: 85470 Bad request
SK: 85483 Bad request
SK: 85502 Bad request
SK: 85527 Bad request
SK: 85532 Bad request
SK: 85566 Bad request
Seems like a lot ...
Tentative plans for today...
1) Test script to convert zero post ID's to TestID's
2) Send one email
3) Advance code for grid capture
4) Pothole scan to 87000
(th)
Offline
It looks like we are closing in on the first fail attack area...
Offline
For SpaceNut re #2458
Thanks for your historical perspective!
*** FYI ... It's been a ** long ** time since I ran the TestID script. I found an old copy and tried it on the set of usernames with Zero Posts that you provided.
It took quite a while to remember how to use the script, which requires an adjustment to the setup screen I had forgotten.
In addition, the script had a tab in a location where it is no longer needed. I have no idea what happened, but removing that tab allowed the script to run properly.
We should have 31 more TestID's in the banned list when the script finishes today.
Our Outreach emails seem to have gone out.
(th)
Offline
For SpaceNut re users with Zero posts
Please double check ... I just ran a search for users with zero posts, and all I found are TestID's
It could be that the list you gave me to convert was the last of them, and I converted that group.
However, there may be others that I can't see for some reason.
Just FYI ... I don't expect to need the TestID script again (if there are no users with zero posts)
However, just in case, I updated the script to perform a mouse click on the Bans page. This will avoid the tab.
(th)
Offline
For SpaceNut re Pothole Scan ....
Reminder: Please double check ... I ** think ** we have converted all Zero Post usernames, but would appreciate your confirming those usernames are now gone.
Completed Sequence for ID: 87000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 1000
Starting Number: 86001
Last Number of Run: 87000
Summary for Web Automation Report for 08-28-2022 at 09:21:24
Average time of Loop from Main form: 00:00:20
90 Skip Exceptions were recorded.
Total time of Processing: 13:20:09
Total time Program was Active: 13:29:40
Skip analysis for 90 Skips:
SK: Skipping 86015
SK: Skipping 86020
SK: Skipping 86028
SK: Skipping 86029
SK: Skipping 86034
SK: Skipping 86054
SK: Skipping 86055
SK: Skipping 86056
SK: Skipping 86058
SK: Skipping 86066
SK: Skipping 86080
SK: Skipping 86083
SK: Skipping 86084
SK: Skipping 86088
SK: Skipping 86091
SK: Skipping 86092
SK: Skipping 86126
SK: Skipping 86128
SK: Skipping 86132
SK: Skipping 86136
SK: Skipping 86138
SK: Skipping 86139
SK: Skipping 86143
SK: Skipping 86145
SK: Skipping 86149
SK: Skipping 86153
SK: Skipping 86164
SK: Skipping 86170
SK: Skipping 86171
SK: Skipping 86172
SK: Skipping 86185
SK: Skipping 86186
SK: Skipping 86191
SK: Skipping 86192
SK: Skipping 86196
SK: Skipping 86205
SK: Skipping 86212
SK: Skipping 86221
SK: Skipping 86223
SK: Skipping 86234
SK: Skipping 86240
SK: Skipping 86261
SK: Skipping 86267
SK: Skipping 86268
SK: Skipping 86308
SK: Skipping 86309
SK: Skipping 86318
SK: Skipping 86338
SK: Skipping 86342
SK: Skipping 86343
SK: Skipping 86346
SK: Skipping 86347
SK: Skipping 86349
SK: Skipping 86351
SK: Skipping 86352
SK: Skipping 86354
SK: Skipping 86355
SK: Skipping 86356
SK: Skipping 86357
SK: Skipping 86358
SK: Skipping 86360
SK: Skipping 86437
SK: Skipping 86446
SK: Skipping 86451
SK: Skipping 86458
SK: Skipping 86463
SK: Skipping 86467
SK: Skipping 86469
SK: Skipping 86470
SK: Skipping 86471
SK: Skipping 86472
SK: Skipping 86473
SK: Skipping 86474
SK: Skipping 86477
SK: Skipping 86478
SK: Skipping 86480
SK: Skipping 86481
SK: Skipping 86482
SK: Skipping 86483
SK: Skipping 86484
SK: Skipping 86485
SK: Skipping 86487
SK: Skipping 86496
SK: Skipping 86497
SK: Skipping 86498
SK: Skipping 86523
SK: Skipping 86528
SK: Skipping 86538
SK: Skipping 86540
SK: Skipping 86563
Boy! I hope things start returning to "normal" soon!
***
I ** really ** appreciate Mars_B4_Moon bringing those old topics back to life, and thus showing the benefits of your months long campaign to make them readable.
I'll launch the email outreach shortly.
Tonight's Pothole scan will try for 88000.
(th)
Offline
For SpaceNut re Pothole Scan ...
We did a bit better in the latest run ... there were only 14 potholes
14 Skip Exceptions were recorded.
SK: Skipping ID: 87072
SK: Skipping ID: 87128
SK: Skipping ID: 87206
SK: Skipping ID: 87260
SK: Skipping ID: 87296
SK: Skipping ID: 87325
SK: Skipping ID: 87334
SK: Skipping ID: 87352
SK: Skipping ID: 87371
SK: Skipping ID: 87372
SK: Skipping ID: 87373
SK: Skipping ID: 87380
SK: Skipping ID: 87383
SK: Skipping ID: 87384
Potholes can occur "naturally" if a member deletes a post.
In addition, an administrator can "move" a post from one topic to another.
A "Move" involves creating a new copy in another topic, and deleting the original.
However, these events ** should ** be rare, so I would expect a "normal" report to be zero potholes.
Tonight's run will try for 89000
We have a candidate lined up for today's email outreach.
(th)
Offline
For SpaceNut re Potholes Scan ....
59 was the count this time ...
Completed Sequence for ID: 89000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 1000
Starting Number: 88001
Last Number of Run: 89000
Summary for Web Automation Report for 08-30-2022 at 07:50:19
Average time of Loop from Main form: 00:00:20
59 Skip Exceptions were recorded.
Total time of Processing: 12:13:39
Total time Program was Active: 12:16:17
Skip analysis for 59 potholes:
Post ID: 88035
Post ID: 88036
Post ID: 88037
Post ID: 88046
Post ID: 88048
Post ID: 88050
Post ID: 88077
Post ID: 88085
Post ID: 88086
Post ID: 88087
Post ID: 88100
Post ID: 88101
Post ID: 88127
Post ID: 88128
Post ID: 88129
Post ID: 88147
Post ID: 88148
Post ID: 88149
Post ID: 88183
Post ID: 88184
Post ID: 88185
Post ID: 88198
Post ID: 88200
Post ID: 88202
Post ID: 88206
Post ID: 88230
Post ID: 88231
Post ID: 88232
Post ID: 88233
Post ID: 88234
Post ID: 88238
Post ID: 88269
Post ID: 88270
Post ID: 88283
Post ID: 88326
Post ID: 88328
Post ID: 88329
Post ID: 88339
Post ID: 88345
Post ID: 88346
Post ID: 88347
Post ID: 88378
Post ID: 88379
Post ID: 88384
Post ID: 88388
Post ID: 88430
Post ID: 88436
Post ID: 88465
Post ID: 88466
Post ID: 88488
Post ID: 88489
Post ID: 88528
Post ID: 88529
Post ID: 88530
Post ID: 88536
Post ID: 88537
Post ID: 88538
Post ID: 88544
Post ID: 88545
Tonight's run will try for 90,000 ... today's post ID is 199700 - 110 days estimated to finish
Per Google: Today's date is 29-Aug-2022 (UTC). Today's Julian Date is 22241 .
241+110 >> 351 ... Projected finish is in mid-December
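The projection above can be checked with a few lines of Python. This is a sketch under the same assumptions as the estimate: 1000 ID's per nightly run, the scan standing at 90000 after tonight, and no allowance for new posts arriving in the meantime. The function and parameter names are made up for the example.

```python
import math
from datetime import date, timedelta

def projected_finish(today, last_scanned_id, newest_id, ids_per_day):
    """Return (days remaining, projected finish date) at a constant daily rate."""
    days_left = math.ceil((newest_id - last_scanned_id) / ids_per_day)
    return days_left, today + timedelta(days=days_left)

today = date(2022, 8, 29)
days_left, finish = projected_finish(today, 90000, 199700, 1000)
print(today.timetuple().tm_yday)      # 241, the day of the year
print(days_left, finish.isoformat())  # 110 2022-12-17
```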
*** Inside baseball section...
During the final days of the Post Repair initiative, I began to explore the possibility of "reading" the browser display like a human, by "looking" at pixels.
The first stage of exploration went fairly smoothly. A test program is able to capture pixel properties from a section of screen as specified by the operator.
The resulting file shows the pixel location and its properties for all the pixels requested.
The ** next ** stage of exploration is taking a bit longer ....
It is possible for a spreadsheet program to show a magnified version of the captured grid, but getting from here to there is a bit more involved than I realized.
Today I finished coding and testing the first of four phases of processing needed to deliver an XML file that can be imported to a spreadsheet.
The steps are:
1) Open header XML, show it to operator, and copy to output if approved <done>
2) Open model XML for rows, show it to operator, and copy it to memory for the next step
3) Open capture data, show it to the operator, and deliver XML for each cell if approved
4) Open trailer XML, show it to operator, and copy it to output if approved.
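The four steps above can be sketched in Python as follows. This is a minimal illustration, not the program being written: the operator approval prompts are left out, the capture data is assumed to be a mapping of (x, y) to a colour string, and the element names (grid, row, cell) are placeholders. A real spreadsheet import would need the header, row model and trailer that the spreadsheet itself expects.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Placeholder fragments standing in for the header, row model and trailer files
HEADER = '<?xml version="1.0"?>\n<grid>\n'
ROW_MODEL = '  <row y="{y}">{cells}</row>\n'
CELL_MODEL = '<cell x="{x}">{value}</cell>'
TRAILER = '</grid>\n'

def build_grid_xml(pixels):
    """pixels maps (x, y) to a colour string; returns the complete XML text."""
    parts = [HEADER]                                   # step 1: header
    for y in sorted({y for _, y in pixels}):           # steps 2 and 3: one row per y
        xs = sorted(x for x, row in pixels if row == y)
        cells = "".join(
            CELL_MODEL.format(x=x, value=escape(pixels[(x, y)])) for x in xs
        )
        parts.append(ROW_MODEL.format(y=y, cells=cells))
    parts.append(TRAILER)                              # step 4: trailer
    return "".join(parts)

xml_text = build_grid_xml({(0, 0): "#ffffff", (1, 0): "#000000", (0, 1): "#ff0000"})
ET.fromstring(xml_text)  # raises ParseError if the output is not well-formed
print(xml_text)
```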
At the rate I'm going, finish is about a week out. Fortunately, there is no hurry, because the email and pothole scripts are running fine without it.
However, ** you ** may come up with something you need for which this would be useful, so I'm planning to finish it.
(th)
Offline
by looking at the next post count it appears that we are now in 2006....
Offline
For SpaceNut re Pothole Scan ...
We show only 14 potholes in last night's run to 90000. That is an improvement, for sure!
The average time-per-post is 20 seconds. Tonight I'll try a run to 92000. It will take just under 12 hours, so the laptop will be busy all night, if all goes well.
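The run time estimate is simple arithmetic, shown here as a small Python sketch. It assumes the 20 second average holds for the whole run, which may not be the case if the browser slows down.

```python
from datetime import timedelta

def estimated_run_time(post_count, seconds_per_post=20):
    """Estimated duration of a scan at a constant time per post."""
    return timedelta(seconds=post_count * seconds_per_post)

print(estimated_run_time(2000))  # 11:06:40
```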
***
Looking ahead to December ... on the 28th (more or less) Mars will complete another orbit. We've been observing the daily movements of Mars since 2019/01/20, and we observed the completion of Year 35 on February 9, 2021. Mars is in its final quarter, and we'll be observing completion of Year 36.
It seems to me the Proposed Business Calendar for Mars has held up well, over the past several Earth years. At some point, I am hoping this proposed calendar will be considered for adoption by the Mars Society, so that it can be offered for sale as a fund raising method for the non-profit goals of the Society. It is late but not TOO late to begin planning for printing of a calendar to cover Year 37. That calendar would simultaneously cover two Earth years: 2023 and 2024.
The end of Mars Year 37 would occur near November 14, 2024. There's a great deal of Mars artwork/photograph images that could be added to a Mars Calendar offering by the Mars Society. The calendar could be offered in both print and digital form.
***
In the Simple OCR study, my hope is to complete two more of the five phases of the XML writer feature.
Update at 12:07 local time. Modest progress ... a step in proceeding to phase two of file is opening the input again.
I decided to deal with this issue by expanding the versatility of the simple program to allow for successive input file opens.
For Email Outreach, I'll review at least one candidate, and perhaps more, depending upon how it goes.
Update at 12:07 ... I found a candidate and sent an email. You should be receiving your copy.
This candidate is from 2003, so it is ** highly ** unlikely we will hear back.
Thanks for helping with the fire discussion. The best preventive measure would ** seem ** to be avoiding combustible materials as much as possible. I note that the NASA high school project reported on by the North Houston chapter of NSS specifically includes fire analysis of materials in the scrubbing process.
The North Houston chapter recently featured a speaker who's been helping with the high school outreach campaign, and he reported on a successful development of a folding table that was eventually put into service on the ISS.
If you're interested in seeing the talk, it is saved as a YouTube video, via a link from northhoustonspace.org
Skip analysis for 14 skips:
SK: Post ID: 89059
SK: Post ID: 89114
SK: Post ID: 89216
SK: Post ID: 89238
SK: Post ID: 89240
SK: Post ID: 89241
SK: Post ID: 89301
SK: Post ID: 89330
SK: Post ID: 89362
SK: Post ID: 89400
SK: Post ID: 89415
SK: Post ID: 89504
SK: Post ID: 89533
SK: Post ID: 89534
Update at 20:22 local time .... a Pothole Scan over 2000 posts is under way ... it should finish before 8 AM tomorrow.
The process slows down as data accumulates in the browser history, so I'm not sure if the performance will remain at 20 seconds per post.
(th)
Offline
For SpaceNut re Pothole Scan ...
The run that finished this morning was set for 2000 posts....
It reached the finish line with a Green Screen and ** only ** 20 potholes...
However, it caused Google to notice... I've never seen this message before, but 12 hours was long enough to cause it:
About this page
Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?
IP address: (this IP address)
Time: 2022-09-01T11:11:12Z
URL: https://www.google.com/search?q=92000&r … e&ie=UTF-8
What's curious about this is that it makes reference to 92000, which was the closing point for the run.
I'll set tonight's run for 2000 again, and it will be interesting to see if the robot detection algorithm shows up again.
The run that just ended gave these results:
Completed Sequence for ID: 92000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 2000
Starting Number: 90001
Last Number of Run: 92000
Summary for Web Automation Report for 09-01-2022 at 08:19:17
Average time of Loop from Main form: 00:00:20
20 Skip Exceptions were recorded.
Total time of Processing: 12:00:32
Total time Program was Active: 12:04:08
Lines containing String One 34 << SpaceNut posts
Lines containing String Two 2,000 posts in run
Lines containing String Three 60 << potholes / 3 >> 20
Lines containing String Four 40 << potholes / 2 >> 20
SK: ID: 90006
SK: ID: 90007
SK: ID: 90031
SK: ID: 90117
SK: ID: 90147
SK: ID: 90168
SK: ID: 90174
SK: ID: 90182
SK: ID: 90222
SK: ID: 90280
SK: ID: 90297
SK: ID: 90338
SK: ID: 90340
SK: ID: 90374
SK: ID: 90412
SK: ID: 90424
SK: ID: 90441
SK: ID: 90454
SK: ID: 90463
SK: ID: 90464
In review of the report of potholes above, I note that the last detection was 90464, and the next 1536 were problem free. A few days back you recalled a rough patch during transition to FluxBB, so I'm wondering if tonight's run might be clean as a whistle.
Work will continue today on Simple OCR ... 2 of 5 phases of XML write are finished. Phase 3 is analysis of the Row Model, and it should go fairly smoothly. Phase 4 is analysis and processing of the captured pixel data. Phase 5 is appending of the trailer XML and it should go quickly.
I still think it will take several days to reach the near term goal, given the (small) amount of time allocated each day.
Update at 15:17 local time ... For the Simple OCR project, today was two steps back (with recovery) and no steps forward.
The procedure involves opening four files in sequence, and there were some loose ends (ie, loops) in the first two phases. Hopefully those have been cleaned up so work can resume on input of the pixel data tomorrow.
(th)
Offline
For SpaceNut
Google noticed the 12 hour run again:
About this page
Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?
IP address: (this one)
Time: 2022-09-02T11:22:59Z
URL: https://www.google.com/search?q=94000&r … e&ie=UTF-8
This time I decided to try the reCAPTCHA option next to "Is this really you" ... It called a combine a tractor, which indicates to me that some employee of reCAPTCHA miscategorized the combine. That would have caused confusion for a lot of humans, as it did for me, but the reCAPTCHA algorithm was satisfied when I clicked on the combine.
The run itself seemed to go well...
Completed Sequence for ID: 94000
Total Command Lines found: 26
Total input Lines in script: 183
Number of ID's processed: 2000
Starting Number: 92001
Last Number of Run: 94000
Summary for Web Automation Report for 09-02-2022 at 08:04:26
Average time of Loop from Main form: 00:00:20
21 Skip Exceptions were recorded.
Total time of Processing: 11:33:41
Total time Program was Active: 11:36:28
Only 21 potholes for 2000 posts seems like a reasonable number, considering what we've seen recently.
Still, I would expect a count of zero.
Investigating further ... it appears that reCAPTCHA started well before the end.
The first occurrence of the word is at 93149.
However, the program was misbehaving before then.
Working backward ... the program was off the rails at 92974
It looks as though the last "good" post processing occurred at 92433
A series of potholes showed up around 92439
A "good" post was processed at 92449
OK ... now I'm thinking it will be necessary to look more carefully at the logs of the Pothole Scans to see if they have gone off the rails in a similar manner.
A new (to me for sure) failure mode has shown up ... The post does not generate a "bad request" but no content is delivered ...
OK ... enough analysis for now .... I'll suspend pothole processing until I understand the problem.
Update at 19:35 local time .... preliminary analysis seems to reveal a problem with how the script handles the transition from submitting a request to edit a post, and processing of the delivered data.
In a nutshell, I suspect a simple tab is used to move from the address bar to the edit window, and if that tab does not take effect due to Internet congestion, then the subsequent processing is not well grounded.
Before running any more scans, I'll study the problem more carefully, and try to find a more reliable solution.
Please note that both BeerMan and GW Johnson report having received acceptance messages from the Conference committee.
(th)
Offline
For SpaceNut re Suspension of Scans....
The excursion into Google's reCAPTCHA territory reported recently has revealed new failure modes.
I suspect they were always present as possibilities, but the slow pace of the previous runs, and the 300 post limit per day, may have prevented putting stress on the system that caused them to show up. Going to 2000 posts in 12 hours ** definitely ** stressed the system to the breaking point.
The first confirmed defect that I found in log analysis yesterday is a failure of the simple TAB that moves from the Address Bar to the display page of the browser.
Under stress, the log shows clearly that the tab may fail to perform its needed action, so that subsequent operations are performed on the Address Bar instead of the display page.
The new Simple OCR that's been in development recently might be able to deal with that, or at least detect the event.
In any case, I've decided to stop the pothole scans for now.
The main benefit of continuing the pothole scans was to exercise the MySQL database in a systematic way. I think that is still worth doing, but it is not a priority.
On the ** other ** hand, the new Email Outreach campaign will (or promises to) go on for years, at the rate of only one person per day.
***
Did you notice that Mars_B4_Moon and friends put the daily active report over the two full page mark?
When I looked for something yesterday evening, I found it in the middle of the third page of results.
It was ** really ** neat to find noosfractal (from 2007) back in view thanks to Mars_B4_Moon updating the Enceladus topic!
(th)
Offline
For SpaceNut re Pothole Scans ...
I've decided to gather more information before attempting to solve whatever the problem may be that showed up recently, when the FluxBB system was put into a stress test by attempting 2000 posts in a 12 hour session.
I ** did ** add a one second wait before the Tab command that I discovered had failed on at least one occasion.
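One way to picture the wait-before-Tab change, with a check added afterward, is the Python sketch below. It is not the WBA script. The two callables (send_tab and page_has_focus) are hypothetical stand-ins for whatever the automation tool provides, and the retry count is an assumption.

```python
import time

def tab_to_page(send_tab, page_has_focus, settle_seconds=1.0, attempts=3,
                sleep=time.sleep):
    """Wait, send a Tab, then confirm the page has focus; retry if it does not.

    send_tab and page_has_focus are hypothetical callables supplied by the
    caller. Returns True once the page has focus, False if every attempt fails.
    """
    for _ in range(attempts):
        sleep(settle_seconds)   # the one second wait before the Tab
        send_tab()
        if page_has_focus():
            return True
    return False

# Simulated browser in which the first Tab is lost and the second one works
state = {"tabs": 0}

def fake_send_tab():
    state["tabs"] += 1

def fake_page_has_focus():
    return state["tabs"] >= 2

print(tab_to_page(fake_send_tab, fake_page_has_focus, sleep=lambda s: None))
```

The point of returning False instead of carrying on is that the script can skip or halt rather than process the Address Bar as if it were a post.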
To gather information, I've added several report commands to the Potholes script.
I'll restart at 78001 and keep the runs to 1000.
The challenge I am facing is how to determine "normal" behavior of the script, when "normal" is characterized by random text written by NewMars members decades ago, and Departure from Normal could be other random characters generated by whatever Internet misbehavior may occur.
An Artificial Intelligence would be able to distinguish actual posted text from the variety of alternatives that show up when things go off the rails.
Unfortunately, this script is a long l - o - n - g way from any kind of intelligence.
(th)
Offline
For SpaceNut re Simple OCR mini-project ...
In study of the behavior of the WBA program, interacting with a browser which is itself interacting with FluxBB, I have concluded that "Edit Post" is a set of pixels that show up consistently when things are going well, and which does NOT show up when any of a myriad of errors occur.
I took a screen capture of the Edit Post section of the view, and then cropped it to 83x20 pixels.
The total number of pixels is given (by GIMP) as 1600. That is a manageable number of pixels to scan.
83x20 is 1660, so the figure GIMP reported may have been rounded down (perhaps to the nearest hundred).
The actual number needed may be far fewer, because mismatches will occur quickly if the image of Edit Post is compared to the random pixels that show up when any of a myriad of errors occur.
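The early-mismatch idea can be sketched in Python as below. This is an illustration only: both images are assumed to be lists of rows of pixel values, which may not be how the capture program stores them.

```python
def matches_reference(captured, reference):
    """Compare two pixel grids, stopping at the first mismatch.

    Returns (matched, pixels_checked).
    """
    if len(captured) != len(reference):
        return False, 0
    checked = 0
    for cap_row, ref_row in zip(captured, reference):
        if len(cap_row) != len(ref_row):
            return False, checked
        for cap_pixel, ref_pixel in zip(cap_row, ref_row):
            checked += 1
            if cap_pixel != ref_pixel:
                return False, checked
    return True, checked

# An 83x20 reference grid of placeholder values
reference = [[(x + y) % 7 for x in range(83)] for y in range(20)]
same = [row[:] for row in reference]
different = [row[:] for row in reference]
different[0][0] = 99   # mismatch in the very first pixel

print(matches_reference(same, reference))       # (True, 1660)
print(matches_reference(different, reference))  # (False, 1)
```

A full match costs all 1660 comparisons, while an error page that differs in the first pixel is rejected after one.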
I'll try to store the png on imgur.com
OK ... so ** that's ** the focus of the current development cycle.
(th)
Offline
For SpaceNut re Pothole Scan rerun ...
I reran the scan from 78001 to 79000 ... 2 skips were recorded and they are likely to be potholes.
However, due to recent failures encountered during long runs of this script, I'm planning to review it for other faults that may have occurred.
Completed Sequence for ID: 79000
Total Command Lines found: 31
Total input Lines in script: 195
Number of ID's processed: 1000
Starting Number: 78001
Last Number of Run: 79000
Summary for Web Automation Report for 09-03-2022 at 20:13:32
Average time of Loop from Main form: 00:00:21
2 Skip Exceptions were recorded.
Total time of Processing: 05:59:09
Three posts were reported with problems - I'll look at them shortly
Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 67
String 2: Completed Sequence for ID: 78511
Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 82
String 2: Completed Sequence for ID: 78625
Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 83
String 2: Completed Sequence for ID: 78626
All three posts came up fine when tested manually.
The sequence from 78001-79000 has no potholes or other artifacts.
I'll set up a run over the series from 79001-80000 using the new script.
As a side note ... there were 80 SpaceNut posts reported but (of course) those are not reported here.
(th)
Offline
Sounds like the detection of a denial of services code set the activity alert.
I agree with running the sequential data locations even if all it shows are potholes.
Yes, activity for Newmars is at an all-time high and I wish that others would not look at it as old topic churn but making topics relevant and not just making topics for the sake of seeing the key word in the title....
sometimes the posts are not quite on topic, but I would rather have people reading the links and then seeing discussion rather than no posts or discussion.
Offline
For SpaceNut ....
Your comments about topic titles are interesting.
Since you are the Senior Administrator, you have an influence over the flow of activity.
***
The change in approach to running the scans is producing interesting results ...
Last night's run covered 79000-80000
There were 112 posts reported
Of those, 55 were SpaceNut posts
That left 57 which appear to be good posts by regular members
The most logical/reasonable explanation is that Internet congestion is interfering with the Tab that is supposed to move the focus from the Address Bar to the Edit window.
There may not be any potholes at all.
The Simple OCR concept I've been working on may be able to solve the problem, by looking for [Edit Post] in the pixels, before proceeding to process (ie, look at) a post.
Overnight, I realized the presentation density on a device is going to influence the usefulness of the method.
I'm ** pretty ** sure the screens in use for lookup, development and automation all use 1024x768.
However, I need to verify that before putting the new Simple OCR into service.
In the mean time (since the development will take several more days) I'm thinking of making a script to check the ID's that are flagged by the Potholes script.
57 ID's is too many to check by hand.
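Pulling the flagged ID's out of a report so a script can recheck them might look like the Python sketch below. It assumes the report lines begin with "SK:" followed by some text and then the post ID, as in the reports posted in this topic. The function name is made up for the example.

```python
import re

# "SK:" at the start of the line, any non-digit text, then the post ID
SKIP_LINE = re.compile(r"^SK:\D*(\d+)")

def flagged_ids(log_lines):
    """Return the post IDs found on lines that start with 'SK:'."""
    ids = []
    for line in log_lines:
        match = SKIP_LINE.match(line.strip())
        if match:
            ids.append(int(match.group(1)))
    return ids

sample = [
    "SK: 85084 Bad request",
    "SK: Skipping 86015",
    "SK: Skipping ID: 87072",
    "SK: Post ID: 89059",
    "Average time of Loop from Main form: 00:00:20",
    "SK: Skipping to Next Item from Post ID: 80102 Bad request",
]
print(flagged_ids(sample))  # [85084, 86015, 87072, 89059, 80102]
```

The resulting list can be written one ID per line to the file the recheck script reads.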
Update at 10:05 ....
I ran a special Pothole script to check ID's from a file. The run showed that all 57 but 1 were good member posts.
There was one SpaceNut post missed in the longer run.
I added a mouse click before the tab, to try to prevent future incidents involving the tab.
Update at 13:20 local time ...
In a recent post I reported on the challenge/opportunity of dealing with differing computer monitor display densities...
Because my initial capture of the bit map of [Edit Post] was taken on a higher density monitor than 1024x768, I decided to capture the same image on the laptop, which is set to 1024x768. To my surprise the resulting capture was almost exactly the same size... 63 pixels long and 13 pixels high.
It is possible that the browser accepts a request from FluxBB and adapts it to the local situation.
In any case, it would appear that a pattern match to confirm [Edit Post] is present on the screen would involve on the order of 1600 (or so) pixels.
It is even possible a smaller number could be checked to have a strong indication the desired pixels are present.
However, there ** is ** another wrinkle ... in watching the WBA program performing Pothole checks today, I noticed that the location of [Edit Post] varies vertically, depending upon the number of words in the title of the index level topic for the post in question.
That means that the program intending to "look" for [Edit Post] pixels is going to need to look in more than one place.
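Looking in more than one place could be sketched as below, in Python. The screen and the [Edit Post] image are assumed to be lists of rows of pixel values, and the horizontal position is assumed to be fixed. Both assumptions would need to be confirmed against real captures.

```python
def region_matches(screen, template, left, top):
    """True if template appears on screen with its corner at (left, top)."""
    for dy, template_row in enumerate(template):
        y = top + dy
        if y >= len(screen):
            return False
        if screen[y][left:left + len(template_row)] != template_row:
            return False
    return True

def find_vertical_offset(screen, template, left, top_candidates):
    """Try each candidate row in turn; return the first that matches, or None."""
    for top in top_candidates:
        if region_matches(screen, template, left, top):
            return top
    return None

# Tiny stand-in screen: 10x10 zeros with a 2x2 pattern placed at column 3, row 5
screen = [[0] * 10 for _ in range(10)]
template = [[1, 2], [3, 4]]
for dy, row in enumerate(template):
    screen[5 + dy][3:5] = row

print(find_vertical_offset(screen, template, left=3, top_candidates=range(10)))  # 5
```

If the title only ever pushes [Edit Post] down by whole text lines, the candidate list could be limited to a handful of rows instead of every row.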
However, ** that's ** a challenge for some future day.
The status of the Simple OCR inquiry is holding at two phases complete out of five needed.
(th)
Offline
For SpaceNut re Re-Scan of Potholes ...
Last night's run reached 81000
Two skips were reported but analysis is needed before we know what those were
Completed Sequence for ID: 81000
Total Command Lines found: 34
Total input Lines in script: 205
Number of ID's processed: 1000
Starting Number: 80001
Last Number of Run: 81000
Summary for Web Automation Report for 09-05-2022 at 09:47:31
Average time of Loop from Main form: 00:00:22
2 Skip Exceptions were recorded.
Total time of Processing: 11:18:33
With changes made to try to understand the problems that showed up recently, multiple runs of a tool to extract any actual problems are needed.
Analysis:
SpaceNut posts: 85
Real potholes: 2
Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80102 Bad request
Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80437 Bad request
However, those reports may have been due to stress of the run...
The two posts cited are manually confirmed. They are real potholes.
Tonight's Pothole scan will try for 82000
Update at 13:44 local time
This is just for the record .... work on the Simple OCR mini-project advanced a bit today.
There are five phases to generation of an XML file suitable for input to a spreadsheet, to show pixels captured from a live screen.
Today the input (and operator confirmation) steps were advanced through phase five. All that remains is the (relatively straightforward) step of appending the trailer XML to the output file.
Left undone at this point are the details needed to actually match up screen capture data with the XML row model, to generate valid XML rows of pixel data.
Meanwhile, the [Edit Post] graphic was captured on the computer where it needs to be captured at run time. I was (a bit) surprised to find that the pixels generated on the target machine (with 1024x768 resolution) were almost identical to those on a machine whose monitor is running at a much higher resolution.
I deduce from this observation that the browser must be working behind the scenes to render the FluxBB screen display so it looks the same on different machines. Carrying this thought further .... there must be "virtual" pixels, because the hardware underlying the display can have a number of physical pixels that differ from the virtual ones.
(th)
Offline
For SpaceNut re Pothole Rescans ...
WBA reached 82000 in last night's run ...
Completed Sequence for ID: 82000
Total Command Lines found: 34
Total input Lines in script: 206
Number of ID's processed: 1000
Starting Number: 81001
Last Number of Run: 82000
Summary for Web Automation Report for 09-06-2022 at 08:37:44
Average time of Loop from Main form: 00:00:22
9 Skip Exceptions were recorded.
Total time of Processing: 15:12:03
The report of 9 skips implies 9 potholes, but (at this point) this is just a suggestion.
Detailed analysis is needed to establish what ** really ** happened.
Update at 10:48 ....It turns out that I've encountered a new failure mode, or rather rediscovered an old one.
In both recent runs, the laptop and FluxBB got out of sync ...
The run from 80000 failed at 80634
The run from 81000 failed at 80683
The runs from 79000 and 80000 appear to have completed without encountering this error.
An idea I'm considering is adjusting the script to halt if it finds the Address Bar contents where the page should be.
I ** think ** there's an existing script command to handle that.
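The halt-on-Address-Bar idea might be sketched as below, in Python rather than as a WBA script command. The marker text is taken from the log lines posted earlier in this topic. The edit.php?id= form used in the example and the rule that a URL is a single token on a single line are assumptions.

```python
EDIT_URL_MARKER = "newmars.com/forums/edit"  # as seen in the scan reports

def looks_like_address_bar(captured_text):
    """Guess whether the captured text is the Address Bar and not the post."""
    text = captured_text.strip().lower()
    if not text or "\n" in text or " " in text:
        return False   # a URL is assumed to be one token on one line
    return EDIT_URL_MARKER in text

def require_page_content(post_id, captured_text):
    """Raise, halting the run, if the capture came from the Address Bar."""
    if looks_like_address_bar(captured_text):
        raise RuntimeError(f"Address Bar captured instead of page at ID {post_id}")
    return captured_text

print(looks_like_address_bar("https://newmars.com/forums/edit.php?id=80634"))  # True
print(looks_like_address_bar("For SpaceNut re Pothole Scan ..."))              # False
```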
In the mean time, I'll restart the scans from the respective failure points.
Starting from 80634 now ...
(th)
Offline