UK diy (uk.d-i-y) For the discussion of all topics related to diy (do-it-yourself) in the UK. All levels of experience and proficiency are welcome to join in to ask questions or offer solutions.
#1
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
(1) Page http://wiki.diyfaq.org.uk/index.php/Main_Page includes "You can discuss this wiki on the talk (discussion) page". I cannot, and the source of that page includes :-

"You do not have permission to edit this page, for the following reason: This page has been protected to prevent editing or other actions."

(2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?".

In my WinXP, a rough list of all words in capital letters in the source, for all HTM files in the root of the master of my web site, can be generated by

    mtr -o^^ -wc- *.htm ".* ([A-Z]+) .*" = \1 | sort | dedupe | sort /r | mt

Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list?

There, mtr is 32-bit MiniTrue and mt is 16-bit MiniTrue, for which see http://www.idiotsdelight.net/minitrue/ and http://adoxa.altervista.org/minitrue/index.html ; dedupe replaces sets of matching lines with a count and one copy.

--
SL
#2
Posted to uk.d-i-y
#3
Posted to uk.d-i-y
"Adrian Caspersz" wrote in message ...
> On 20/02/16 13:25, wrote:
>> Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list?
> What did your last slave die of? This seems to be a common theme with your postings. May I ask why?

The high turnover of slaves might be down to their tardiness in responding to demands.
#4
Posted to uk.d-i-y
wrote in message
... (1) Page http://wiki.diyfaq.org.uk/index.php/Main_Page includes "You can discuss this wiki on the talk (discussion) page". I cannot, and the source of that page includes :- "You do not have permission to edit this page, for the following reason: This page has been protected to prevent editing or other actions." (2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?". In my WinXP, a rough list of all words in capital letters in the source, for all HTM files in the root of the master of my web site, can be generated by mtr -o^^ -wc- *.htm ".* ([A-Z]+) .*" = \1 | sort | dedupe | sort /r | mt Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list? There, mtr is 32-bit MiniTrue and mt is 16-bit MiniTrue, for which see http://www.idiotsdelight.net/minitrue/ and http://adoxa.altervista.org/minitrue/index.html ; dedupe replaces sets of matching lines with a count and one copy. http://wiki.diyfaq.org.uk/index.php/Account_Requests any use? And I am sure John Rumm will see this and give you any help you need. -- Adam |
#6
Posted to uk.d-i-y
On Saturday, 20 February 2016 13:43:40 UTC, Adrian Caspersz wrote:
> On 20/02/16 13:25, SL wrote:
>> Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list?
> What did your last slave die of? This seems to be a common theme with your postings. May I ask why?

It is because I recognise that I do not know everything.

The first question must be whether such a list would be considered sufficiently useful, although that depends on how easy it would be to do it. In this case the initial finding of possible entries is best done in one or more of the following ways :-

(1) By crowd-sourcing.
(2) By running similar code to mine on a system which holds the whole site, preferably doing so not on the source code but on the pages as rendered by a browser (maybe using Lynx). You will perhaps have noticed that, unless there is pre-processing, that code will miss about 10% of acronym instances.
(3) By using a facility in the Wiki code itself, if such exists.
(4) By running that or similar code on the Glossary itself.
(5) By one or more of others of the many means not known to me.

I now know that the Glossary exists and contains acronyms. Since those coming on an acronym might look for an Acronyms page without remembering that there might be a Glossary under that or some other name, perhaps the best answer is to have an Acronyms page in the Index, containing just "See [[Glossary]]".

--
SL
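For anyone without MiniTrue, the idea behind the pipeline can be sketched in Python. This is only an approximation under stated assumptions: it assumes the pattern `.* ([A-Z]+) .*` behaves like an ordinary greedy regular expression (so it yields at most one word per line and needs a space on each side), and the function names `one_per_line`, `all_per_line` and `report` are made up for illustration. It shows one plausible way such a pattern can miss instances, and a variant that collects every space-delimited capital word.

```python
import re
from collections import Counter

# The pattern from the post, read as an ordinary greedy regular expression.
GREEDY_LINE = re.compile(r".* ([A-Z]+) .*")
# Same idea, but able to report every occurrence on a line.
EVERY_WORD = re.compile(r"(?<= )[A-Z]+(?= )")


def one_per_line(lines):
    """Count the single capital word that the greedy pattern captures per line."""
    counts = Counter()
    for line in lines:
        match = GREEDY_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def all_per_line(lines):
    """Count every capital word that has a space on both sides."""
    counts = Counter()
    for line in lines:
        counts.update(EVERY_WORD.findall(line))
    return counts


def report(counts):
    """Return 'count word' lines, most frequent first (a stand-in for dedupe + sort /r)."""
    return [f"{n} {word}" for word, n in counts.most_common()]
```

Both variants still miss a word that touches punctuation or a tag, which is one reason to run such a scan on rendered text rather than on page source.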
#7
Posted to uk.d-i-y
On Saturday, 20 February 2016 13:54:33 UTC, ARW wrote:
> SL wrote in message ...
>> ...
> http://wiki.diyfaq.org.uk/index.php/Account_Requests any use?

As a page for general use, yes; as a suggestion, none.

> And I am sure John Rumm will see this and give you any help you need.

A well-meant but unnecessary reassurance.

--
SL
#8
Posted to uk.d-i-y
On Saturday, 20 February 2016 14:32:57 UTC, John Rumm wrote:
> On 20/02/2016 13:25, SL wrote:
>> (1) Page http://wiki.diyfaq.org.uk/index.php/Main_Page includes "You can discuss this wiki on the talk (discussion) page". I cannot, and the source of that page includes :- "You do not have permission to edit this page, for the following reason: This page has been protected to prevent editing or other actions."
> I have changed the permissions on that page - you should be able to edit it now when you are logged in. (It was protected back in the days when anonymous edits were allowed, and it used to get spammed hourly!)

Good. It definitely looks editable, but I don't now need to do it, because ...

>> (2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?".
> You mean like a Glossary? How about: http://wiki.diyfaq.org.uk/index.php/Glossary

I had forgotten, or not previously seen, that. I did _mean_ a mere Acronyms list, but a Glossary, if sufficiently complete for acronyms, will certainly serve.

>> In my WinXP, a rough list of all words in capital letters in the source, for all HTM files in the root of the master of my web site, can be generated by
>>     mtr -o^^ -wc- *.htm ".* ([A-Z]+) .*" = \1 | sort | dedupe | sort /r | mt
>> Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list?
> You can help yourself to a copy of the content from here: http://internode.co.uk/diyfaq_wiki_backup/

I fear it may be too big for this old PC, and "dedupe" and its compiler are 16-bit code ... boots better PC ... I see your backup Index, but the main file is nearly two orders of magnitude bigger than the file set I was testing on, and I don't know how I might open such files.

Alas, I can no longer recall the acronym I wanted to look up, except that it was used recently in this group and had three letters with the last being "O" - but HEPVO was used yesterday in "Waste water not draining" and that's not in the Glossary. Google knows about HEPVO, which appears to be not an Acronym; but to be really useful an Acronyms list should include "word"s that look like acronyms ... . And the DNS knows it, but http://www.hepvo.com/ does not give its etymology. Might a HepvO suffer if some of the more vigorous methods of pipe-unblocking were applied?

General thanks,

--
SL
#9
Posted to uk.d-i-y
On 20/02/2016 19:45, wrote:
On Saturday, 20 February 2016 14:32:57 UTC, John Rumm wrote: On 20/02/2016 13:25, SL wrote: (1) Page http://wiki.diyfaq.org.uk/index.php/Main_Page includes "You can discuss this wiki on the talk (discussion) page". I cannot, and the source of that page includes :- "You do not have permission to edit this page, for the following reason: This page has been protected to prevent editing or other actions." I have changed the permissions on that page - you should be able to edit it now when you are logged in. (it was protected back in the days where anonymous edits were allowed, and it used to get spammed hourly!) Good. It definitely looks editable, but I don't now need to do it, because ... (2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?". You mean like a Glossary? How about: http://wiki.diyfaq.org.uk/index.php/Glossary I had forgotten, or not previously seen, that. I did _mean_ a mere Acronyms list, but a Glossary, if sufficiently complete for acronyms, will certainly serve. In my WinXP, a rough list of all words in capital letters in the source, for all HTM files in the root of the master of my web site, can be generated by mtr -o^^ -wc- *.htm ".* ([A-Z]+) .*" = \1 | sort | dedupe | sort /r | mt Perhaps the Wiki site owner can do similar or better code to get a first approximation to a list? You can help yourself to a copy of the content from he http://internode.co.uk/diyfaq_wiki_backup/ I fear it may be too big for this old PC, and "dedupe" and its compiler are 16-bit code ... boots better PC ... I see your backup Index, but the main file size is larger than that which I was testing with is nearly two orders of magnitude bigger than the file set I was testing on, and I don't know how I might open such files. 7 zip will decompress it. After that its just a big set of SQL in a text file that would rebuild the database. (you can ignore the 500meg file!) 
Alas, I can no longer recall the acronym I wanted to look up, except that it was used recently in this group and had three letters with the last being "O" - but HEPVO was used yesterday in "Waste water not draining" and that's not in the Glossary. Google knows about HEPVO, which appears to be not an Acronym; but to be really useful an Acronyms list should include "word"s that look like acronyms ... . And the DNS knows it, but http://www.hepvo.com/ does not give its etymology. Might a HepvO suffer if some of the more vigorous methods of pipe-unblocking were applied Hep was from Hepworth plastics. Now absorbed along with OSMA by Wavin. The VO was a range of pipe and fittings IIUC. http://www.wavin.co.uk/web/news/show...h-plastics.htm -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
#10
Posted to uk.d-i-y
On 20/02/2016 21:43, John Rumm wrote:
> On 20/02/2016 19:45, wrote:
>> Google knows about HEPVO, which appears to be not an Acronym; but to be really useful an Acronyms list should include "word"s that look like acronyms ... . And the DNS knows it, but http://www.hepvo.com/ does not give its etymology. Might a HepvO suffer if some of the more vigorous methods of pipe-unblocking were applied
> Hep was from Hepworth plastics. Now absorbed along with OSMA by Wavin. The VO was a range of pipe and fittings IIUC.

In fact, looking at: http://www.wavin.co.uk/web/solutions...raps/hepvo.htm

It suggests it won't stand up to vigorous cleaning...

--
Cheers,

John.

/=================================================================\
| Internode Ltd - http://www.internode.co.uk |
|-----------------------------------------------------------------|
| John Rumm - john(at)internode(dot)co(dot)uk |
\=================================================================/
#11
Posted to uk.d-i-y
On Saturday, 20 February 2016 13:25:19 UTC, wrote:
(2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?". If you want to compile a list I could upload it into an article. The glossary shows the syntax required (click edit to see it). NT |
#12
Posted to uk.d-i-y
In uk.d-i-y message
om, Sat, 20 Feb 2016 20:21:13, posted: On Saturday, 20 February 2016 13:25:19 UTC, wrote: (2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?". If you want to compile a list I could upload it into an article. The glossary shows the syntax required (click edit to see it). It occurs to me that I already have JavaScript that, if executed in a page in the root directory of a web site, can spider that site checking links and anchors (the real purpose) and as a sideline checking instances of "YYYY-MM-DD DoW" and seeking a shibboleth RegExp. I also have code to traverse the DOM tree of a page as loaded; it could easily enough look for text nodes and seek in them matches for a chosen RegExp that would match most acronyms and not a lot else. I also know an effective way of making a de-duplicated list of matches found. So it would not be so very difficult to write a new-acronym-seeker. Perhaps someone has done it already. The aforesaid script is in view-source:http://web.archive.org/web/201509080...p://www.merlyn ..demon.co.uk/linxchek.htm. H'mm - using \b[A-Z]{4,}\b as the shibboleth RegExp, case-dependent, my site master counts 6896 matches; but that currently (perhaps) scanned the source of the pages, which was good enough for the original purpose. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
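The approach described above (walk the text nodes of a loaded page, match a RegExp, keep a de-duplicated list) can be sketched outside the browser as well. The following is not LINXCHEK; it is a minimal Python illustration using the standard-library HTML parser, and the names `TextNodeCollector` and `candidate_acronyms` are hypothetical. It uses the pattern quoted in the post and ignores tag attributes and the contents of script and style elements, so that only text a reader would see is scanned.

```python
import re
from collections import Counter
from html.parser import HTMLParser

ACRONYM = re.compile(r"\b[A-Z]{4,}\b")  # the pattern quoted in the post


class TextNodeCollector(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def candidate_acronyms(html):
    """Return a Counter of candidate acronyms found in the visible text of one page."""
    parser = TextNodeCollector()
    parser.feed(html)
    parser.close()
    return Counter(ACRONYM.findall(" ".join(parser.chunks)))
```

A Counter doubles as the de-duplicated list: its keys are the distinct candidates and its values say how often each was seen.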
#13
Posted to uk.d-i-y
On Saturday, 27 February 2016 23:41:08 UTC, Dr J R Stockton wrote:
In uk.d-i-y message om, Sat, 20 Feb 2016 20:21:13, tabbypurr posted: On Saturday, 20 February 2016 13:25:19 UTC, wrote: (2) I was going to add - "How about a page for a list of all DIY acronyms used in the Wiki (and in the newsgroup), with their meanings?". If you want to compile a list I could upload it into an article. The glossary shows the syntax required (click edit to see it). It occurs to me that I already have JavaScript that, if executed in a page in the root directory of a web site, can spider that site checking links and anchors (the real purpose) and as a sideline checking instances of "YYYY-MM-DD DoW" and seeking a shibboleth RegExp. I also have code to traverse the DOM tree of a page as loaded; it could easily enough look for text nodes and seek in them matches for a chosen RegExp that would match most acronyms and not a lot else. I also know an effective way of making a de-duplicated list of matches found. So it would not be so very difficult to write a new-acronym-seeker. Perhaps someone has done it already. The aforesaid script is in view-source:http://web.archive.org/web/201509080...p://www.merlyn .demon.co.uk/linxchek.htm. H'mm - using \b[A-Z]{4,}\b as the shibboleth RegExp, case-dependent, my site master counts 6896 matches; but that currently (perhaps) scanned the source of the pages, which was good enough for the original purpose. Go for it NT |
#14
Posted to uk.d-i-y
In uk.d-i-y message, Sun, 28 Feb 2016 02:04:40, tabbypurr posted:
> On Saturday, 27 February 2016 23:41:08 UTC, Dr J R Stockton wrote:
>> ...
>> So it would not be so very difficult to write a new-acronym-seeker. Perhaps someone has done it already. The aforesaid script is in view-source:http://web.archive.org/web/201509080...p://www.merlyn .demon.co.uk/linxchek.htm.
>> H'mm - using \b[A-Z]{4,}\b as the shibboleth RegExp, case-dependent, my site master counts 6896 matches; but that currently (perhaps) scanned the source of the pages, which was good enough for the original purpose.
> Go for it

It is done, to the extent of getting something that runs and gives about the right output; for example :-

.. 1 'PMRF'
.. 1 'PNUP'
.. 4 'POINTS'
.. 1 'POLAR'
.. 1 'PONSE'
.. 7 'POSIX'
.. 1 'POSSIBLY'
.. 13 'POST'
.. 1 'POSTMASTER'
.. 1 'POSTSCRIPT'
.. 2 'POTUS'
.. 1 'POWERS'
.. 1 'PRAEFATIO'

The selection needs refining; for example PONSE comes from RÉPONSE which may be seen by the selector as RéPONSE (Euler wrote that word) and others are just ordinary words in upper-case. PMRF is https://en.wikipedia.org/wiki/Pacifi...Range_Facility (at Barking Sands). AFAIR, the uk.d-i-y FAQ would give a much smaller proportion of false positives.

An auxiliary program or editor script could convert that list into a pseudo-web page with a list of entries each making a Google (or other) search of the FAQ site to locate the use of each "acronym" there.

It needs to be documented! Remember that the archived copy does not do this.

I am using Firefox in WinXP sp3, with local files - I don't know whether it would, for example, run on the FAQ server. This page and the site being tested must be on the same system - the "same origin policy".

--
(c) John Stockton, Surrey, UK. Turnpike v6.05 MIME.
Merlyn Web Site - FAQish topics, acronyms, & links.
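The PONSE entry is what one would expect when the word boundary `\b` only treats ASCII letters as word characters, so an accented capital counts as a break. A small Python sketch of the effect and of one possible refinement follows; the `re.ASCII` flag is used here to imitate that ASCII-only behaviour, and the function names are hypothetical. The refinement splits the text into whole Unicode words first and only then tests for all-capitals, so an accented word is either kept whole or rejected whole.

```python
import re

# ASCII-only matching: an accented letter is not a "word" character here.
ASCII_PATTERN = re.compile(r"\b[A-Z]{4,}\b", re.ASCII)
# Unicode-aware word splitter (Python's default for str patterns).
WORD = re.compile(r"\w+")


def ascii_candidates(text):
    """Candidates as an ASCII-only pattern sees them; may return word fragments."""
    return ASCII_PATTERN.findall(text)


def unicode_candidates(text, min_len=4):
    """Whole words that are entirely upper-case letters, accented or not."""
    return [
        word
        for word in WORD.findall(text)
        if len(word) >= min_len and word.isalpha() and word.isupper()
    ]
```

This removes the fragment but not the false positive: RÉPONSE is still an ordinary word in capitals and would need a stop-list or a manual pass to reject.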
#15
Posted to uk.d-i-y
On Monday, 29 February 2016 23:42:04 UTC, Dr J R Stockton wrote:
In uk.d-i-y message om, Sun, 28 Feb 2016 02:04:40, tabbypurr posted: On Saturday, 27 February 2016 23:41:08 UTC, Dr J R Stockton wrote: ... So it would not be so very difficult to write a new-acronym-seeker. Perhaps someone has done it already. The aforesaid script is in view-source:http://web.archive.org/web/201509080...p://www.merlyn .demon.co.uk/linxchek.htm. H'mm - using \b[A-Z]{4,}\b as the shibboleth RegExp, case-dependent, my site master counts 6896 matches; but that currently (perhaps) scanned the source of the pages, which was good enough for the original purpose. Go for it It is done excellent, post the list of acronyms and it can go up. NT |
#16
Posted to uk.d-i-y
On 29/02/2016 23:09, Dr J R Stockton wrote:
I am using Firefox in WinXP sp3, with local files - I don't know whether it would, for example, run on the FAQ server. This page and the site being tested must be on the same system - the "same origin policy". The server runs CentOS (i.e. designed to be compatible with Red Hat Enterprise Linux). So, yes it could probably run on the server if one ran a local X Server... -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
#17
Posted to uk.d-i-y
In uk.d-i-y message
om, Tue, 1 Mar 2016 11:47:14, posted: On Monday, 29 February 2016 23:42:04 UTC, Dr J R Stockton wrote: In uk.d-i-y message om, Sun, 28 Feb 2016 02:04:40, tabbypurr posted: On Saturday, 27 February 2016 23:41:08 UTC, Dr J R Stockton wrote: ... So it would not be so very difficult to write a new-acronym-seeker. Perhaps someone has done it already. The aforesaid script is in view-source:http://web.archive.org/web/201509080...p://www.merlyn .demon.co.uk/linxchek.htm. H'mm - using \b[A-Z]{4,}\b as the shibboleth RegExp, case-dependent, my site master counts 6896 matches; but that currently (perhaps) scanned the source of the pages, which was good enough for the original purpose. Go for it It is done excellent, post the list of acronyms and it can go up. It is done - but for the master copy of a site stored on my local PC and with HTML pages with extensions only htm, html, shtml, xhtml. That is a user-editable list. I do not know, for example, whether having directories with extension php matters. To be used elsewhere, it must be tested elsewhere. My Web site is flattish, i.e. all ordinary pages are in its root folder, and I have no need to check the content of non-root pages. That caused browser-dependent difficulties, years ago. But I have tried LINXCHEK.HTM on a dummy site containing the FAQ's Detergent page, and it did find the acronym SLES. I added the Dimmed_PIR_Lights page; it found several TLAs. What I have so far done was fun, and is mildly potentially useful to me; I may be able to improve it. The "Remote Origin" property of iframe elements means that the page can only check sites that run a copy of itself; and Chrome's interpretation of "Remote Origin" prevents working at all there. Perhaps it should be converted into an HTA; but is continued support for HTAs to be expected? 
One might say that an acronyms list should never be needed, because all acronyms used in a page either should be so well known that they need no explanation (NATO) or can be found directly in Wikipedia (DHMO), or should be explained at the first occurrence on each page. So perhaps it should give only multiple page entries as acronyms to be listed. There may be no need at all for a news:uk.d-i-y acronyms list because the glossary may suffice, though the glossary is no help for those who have found an acronym and don't know that the glossary exists. I have modernised "LED" in the Glossary - someone please review. I suggest that Glossary, and Acronyms if created, should be added to the little Navigation box at upper left; and "Usenet" should be put before uk.d-i-y just below it. Page Category:Electrical is not age- or gender- neutral; all the pictures are of old males. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
#18
Posted to uk.d-i-y
|
|||
|
|||
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message, Tue, 1 Mar 2016 20:35:38, John Rumm posted:
> On 29/02/2016 23:09, Dr J R Stockton wrote:
>> I am using Firefox in WinXP sp3, with local files - I don't know whether it would, for example, run on the FAQ server. This page and the site being tested must be on the same system - the "same origin policy".
> The server runs CentOS (i.e. designed to be compatible with Red Hat Enterprise Linux). So, yes it could probably run on the server if one ran a local X Server...

If you can put LINXCHEK.HTM on the server in the root directory of the FAQ, and read it directly in a browser, then it seems likely that it will find candidate acronyms when run there.

It would want to be provided with a file list for the site root and subfolders. I use MSDOS Command Prompt DIR /B /S which gives lines like

    C:\HOMEPAGE\ZYP.$$$
    C:\HOMEPAGE\ZYP.BAT
    C:\HOMEPAGE\ZYP.OK0
    C:\HOMEPAGE\20010716\000-WARN.TXT
    C:\HOMEPAGE\20010716\00INDEX.HTM
    C:\HOMEPAGE\20010716\00INDEX.TXT

That information could be supplied, I suppose, by UNIX LS and either the LS output could be adjusted to match the above or LINXCHEK could be adjusted to accept either form. I have no experience of UNIX-type systems.

Or, maybe, the file set could be copied to a Windows PC and tested there. No doubt Internode could do it - but would that be useful enough [remembering that LINXCHEK was written to check local links and anchors in the master of my (in-limbo) Web site]? I suspect not.

--
(c) John Stockton, Surrey, UK. Turnpike v6.05 MIME.
Merlyn Web Site - FAQish topics, acronyms, & links.
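A file list in the DIR /B /S style can also be produced on a UNIX-type system without adjusting ls output by hand. The sketch below is one way to do it in Python, under assumptions: the function name `dir_b_s_lines` and the default prefix `C:\HOMEPAGE` are illustrative only, and it lists files but not the directory entries that DIR would also print.

```python
import os
import tempfile
from pathlib import PureWindowsPath


def dir_b_s_lines(root, windows_prefix="C:\\HOMEPAGE"):
    """List every file under root, one per line, written as a Windows-style full path."""
    lines = []
    for folder, subfolders, files in os.walk(root):
        subfolders.sort()  # visit subfolders in a predictable order
        relative = os.path.relpath(folder, root)
        parts = [] if relative == "." else relative.split(os.sep)
        for name in sorted(files):
            lines.append(str(PureWindowsPath(windows_prefix, *parts, name)))
    return lines
```

Calling `dir_b_s_lines("/var/www/faq")` (a made-up path) would give lines a Windows-oriented checker could read unchanged, files in the root first and then each subfolder in turn.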
#19
Posted to uk.d-i-y
|
|||
|
|||
http://wiki.diyfaq.org.uk/index.php/Main_Page
On Wednesday, 2 March 2016 23:45:27 UTC, Dr J R Stockton wrote:
Page Category:Electrical is not age- or gender- neutral; all the pictures are of old males. Well yes. The article is about oldsters, and there are pictures of females in their prime here http://wiki.diyfaq.org.uk/index.php/Pattress and here http://wiki.diyfaq.org.uk/index.php/Plug_%26_socket NT |
#20
Posted to uk.d-i-y
|
|||
|
|||
http://wiki.diyfaq.org.uk/index.php/Main_Page
On Wed, 02 Mar 2016 23:43:48 -0800, tabbypurr wrote:
On Wednesday, 2 March 2016 23:45:27 UTC, Dr J R Stockton wrote: Page Category:Electrical is not age- or gender- neutral; all the pictures are of old males. Well yes. The article is about oldsters, and there are pictures of females in their prime here http://wiki.diyfaq.org.uk/index.php/Pattress and here http://wiki.diyfaq.org.uk/index.php/Plug_%26_socket And even male couples. |
#21
Posted to uk.d-i-y
|
|||
|
|||
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 02/03/2016 17:44, Dr J R Stockton wrote:
> In uk.d-i-y message , Tue, 1 Mar 2016 20:35:38, John Rumm posted:
>> On 29/02/2016 23:09, Dr J R Stockton wrote:
>>> I am using Firefox in WinXP sp3, with local files - I don't know whether it would, for example, run on the FAQ server. This page and the site being tested must be on the same system - the "same origin policy".
>> The server runs CentOS (i.e. designed to be compatible with Red Hat Enterprise Linux). So, yes it could probably run on the server if one ran a local X Server...
> If you can put LINXCHEK.HTM on the server in the root directory of the FAQ, and read it directly in a browser, then it seems likely that it will find candidate acronyms when run there.

Are you talking about the FAQ particularly, or the wiki?

The FAQ is just a static set of HTML web pages. So that can be processed locally.

The WIKI is a very different beast, in that there are no static HTML pages at all. There is a large directory structure of images, a huge folder of PHP scripts, and a massive SQL database that contains all the text. It's not until you request a page that the HTML actually gets "written" as such. So you can browse to a page and then save it. But prior to requesting the page there is no HTML file you can look at.

> It would want to be provided with a file list for the site root and subfolders. I use MSDOS Command Prompt DIR /B /S which gives lines like
>     C:\HOMEPAGE\ZYP.$$$
>     C:\HOMEPAGE\ZYP.BAT
>     C:\HOMEPAGE\ZYP.OK0
>     C:\HOMEPAGE\20010716\000-WARN.TXT
>     C:\HOMEPAGE\20010716\00INDEX.HTM
>     C:\HOMEPAGE\20010716\00INDEX.TXT
> That information could be supplied, I suppose, by UNIX LS and either the LS output could be adjusted to match the above or LINXCHEK could be adjusted to accept either form.

That's fine where you have HTML pages to look at (e.g. the FAQ), but not much help for dynamically generated sites.

> I have no experience of UNIX-type systems.

Used on a desktop machine with a graphical interface, it would behave in much the same way.

When you are accessing a remote server, however, it's a somewhat different setup. Typically you have command line access, but you won't be able to run Firefox/Chrome or whatever in that environment. *nix systems do allow the graphical interface and the machine running the applications to be separated though, so you can run an X Windows server on your machine, and then connect to a remote computer via SSH or similar, and have it run a program that needs a GUI using the services of your X server[1]. It works well enough when the machines concerned can see each other on a LAN, although performance takes a nose dive when one is the other end of a much slower internet connection.

[1] Client and server terminology gets a bit confusing when talking about X windows. Your local computer is a client talking to a remote server, but the server is also a client using the graphics capabilities of your local machine acting as a screen display server!

> Or, maybe, the file set could be copied to a Windows PC and tested there. No doubt Internode could do it - but would that be useful enough [remembering that LINXCHEK was written to check local links and anchors in the master of my (in-limbo) Web site]? I suspect not.

I can zip and email you the FAQ pages if you want... they should be processable with your current setup.

I suspect that for processing the wiki, you would be better off first "spidering" the site following all the internal links, and caching local copies of the pages generated. That would save all the complexity of dealing with the server, database, or any working at arm's length on a remote machine.

(If you have a windows port of the wget command, then that can probably do what you need)

--
Cheers,

John.

/=================================================================\
| Internode Ltd - http://www.internode.co.uk |
|-----------------------------------------------------------------|
| John Rumm - john(at)internode(dot)co(dot)uk |
\=================================================================/
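The spidering suggestion above (follow internal links, cache a local copy of each generated page) can be outlined in a few lines of Python. This is a sketch, not a replacement for wget: the names `LinkCollector`, `internal_links` and `spider` are hypothetical, and the `fetch` argument is an assumption standing in for whatever actually retrieves a page, so the outline can be tried without touching a real server.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_links(page_url, html):
    """Return absolute same-host URLs linked from one page, fragments removed."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == host:
            found.add(absolute)
    return found


def spider(start_url, fetch):
    """Breadth-first crawl; fetch(url) must return HTML text, or None if there is no page."""
    cache, seen, queue = {}, {start_url}, deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        cache[url] = html  # the local copy to scan later
        for link in sorted(internal_links(url, html)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return cache
```

The returned dictionary is the local cache; an acronym or link scan can then run over its values with no further requests to the server.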
#22
Posted to uk.d-i-y
In uk.d-i-y message ,
Thu, 3 Mar 2016 13:09:05, John Rumm posted: On 02/03/2016 17:44, Dr J R Stockton wrote: In uk.d-i-y message , Tue, 1 Mar 2016 20:35:38, John Rumm posted: On 29/02/2016 23:09, Dr J R Stockton wrote: I am using Firefox in WinXP sp3, with local files - I don't know whether it would, for example, run on the FAQ server. This page and the site being tested must be on the same system - the "same origin policy". The server runs CentOS (i.e. designed to be compatible with Red Hat Enterprise Linux). So, yes it could probably run on the server if one ran a local X Server... If you can put LINXCHEK.HTM on the server in the root directory of the FAQ, and read it directly in a browser, then it seems likely that it will find candidate acronyms when run there. Are you talking about the FAQ particularly, or the wiki? Both, rather indiscriminately, I fear. Both, as presented to a user's browser, contain links and anchors which might be checked, and acronyms which might be listed. The FAQ is just a static set of HTML web pages. So that can be processed locally. The WIKI is a very different beast, in that there are no static HTML pages at all. There is a large directory structure of images, a huge folder of PHP scripts, and a massive SQL database that contains all the text. Its not until you request a page that the HTML actually gets "written" as such. So you can browse to a page and then save it. But prior so requesting the page there is no HTML file you can look at. But it is the content of the delivered HTML files that matters. The browser JavaScript "same origin" policy stops me here running LINXCHEK to fetch and test pages from elsewhere (I have fetched a couple of "family" sites by saving from a browser and checking locally). H'mm - perhaps I could make LINXCHEK.HTM into a .HTA page, if that can fetch "other-origin" pages - but that would make it an abusable tool which I would not choose to release. But will .HTA continue to be supported? 
It would want to be provided with a file list for the site root and subfolders. I use MSDOS Command Prompt DIR /B /S which gives lines like C:\HOMEPAGE\ZYP.$$$ C:\HOMEPAGE\ZYP.BAT C:\HOMEPAGE\ZYP.OK0 C:\HOMEPAGE\20010716\000-WARN.TXT C:\HOMEPAGE\20010716\00INDEX.HTM C:\HOMEPAGE\20010716\00INDEX.TXT That information could be supplied, I suppose, by UNIX LS and either the LS output could be adjusted to mach the above or LINXCHEK could be adjusted to accept either form. That's fine where you have HTML pages to look at (e.g. the FAQ), but not much help for dynamically generated sites. Either there would need to be a list of the pages that might be generated, or LINXCHEK would have to be able to ask for a page and recognise the "404-type" response somehow. The site maintainer could, in principle, put a secret word such as zrgnzbecubfvf, in one-point white-on-white, in his 404 page generator ... ... I can zip and email you the FAQ pages if you want... they should be processable with your current setup. Please do so, but only after reading an E-mail from me which should be with you very late today (Monday) , so that it will arrive in a suitable account. I suspect that for processing the wiki, you would be better off first "spidering" the site following all the internal links, and caching local copies of the pages generated. That would save all the complexity of dealing with the server, database, or any working at arms length on a remote machine. LINXCHEK already spiders the site; but it extracts all of the needed information from each page when read - no page caching or re-reading is needed. For each page read, the browser creates lists of links and anchors automatically (it may be scanning the DOM tree to get a list of IDs). (If you have a windows port of the wget command, then that can probably do what you need) -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
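The secret-word idea for recognising a "404-type" response can be expressed as a very small check. The sketch below assumes the checker can see both a status code and the page body, which may not hold for every way of loading a page; the function name `page_exists` is hypothetical and the marker is simply the example word from the post.

```python
MARKER = "zrgnzbecubfvf"  # the example secret word from the post


def page_exists(status_code, body, marker=MARKER):
    """Treat a response as a real page unless it is a 404 or carries the marker."""
    if status_code == 404:
        return False
    return marker not in body
```

The marker test matters for a site that answers a missing page with status 200 and a friendly "not found" page, where the status code alone would mislead.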
#23
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message ,
Thu, 3 Mar 2016 13:09:05, John Rumm posted: I can zip and email you the FAQ pages if you want... they should be processable with your current setup. He did. They were. Interesting. Useful to me. More later. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
#24
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message id
, Tue, 15 Mar 2016 23:24:28, Dr J R Stockton ..uk.invalid posted: In uk.d-i-y message , Thu, 3 Mar 2016 13:09:05, John Rumm posted: I can zip and email you the FAQ pages if you want... they should be processable with your current setup. He did. They were. Interesting. Useful to me. More later. LINXCHEK found 246 instances of candidate acronyms, 149 distinct candidates. Most came from subject lines in upper-case, and could be rejected at first glance. Some were on the existing Acronyms List. Some were too well-known to need explanation, such as ISBN and PTFE. A few do IMHO justify an explanation, either in the List or in the pages in which they appear. LINXCHEK also found 15 anchors to be cited and missing, and 10 anchors not cited from within the FAQ. And it inadvertently found quite a number of over-munged "Mailto:"s. Doing this, it read 48 HTML files in about 40 seconds on an old desktop PC - I think that most pages made a Google call on-load, which would have slowed things down a bit, as would the Classic FM from ZA that the PC was also playing. These findings have been notified to John Rumm as Editor. Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. LINXCHEK has been improved during this activity. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
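[Editorial sketch] For readers without LINXCHEK, a first approximation to "instances of candidate acronyms" versus "distinct candidates" can be had with a few lines of Python. This is not how LINXCHEK works (it runs as JavaScript in a browser); the class and function names here are hypothetical, and the rule "two or more consecutive capitals forming a whole word" is an assumption chosen for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def candidate_acronyms(html: str, min_len: int = 2) -> Counter:
    """Count whole words made only of capital letters (assumed rule)."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    words = re.findall(r"\b[A-Z]{%d,}\b" % min_len, text)
    return Counter(words)


if __name__ == "__main__":
    page = "<p>Use PTFE tape. PTFE and ISBN are <b>TLAs</b>.</p>"
    found = candidate_acronyms(page)
    print(sum(found.values()), "instances,", len(found), "distinct")
```

As with the real run, upper-case subject lines would still produce many false candidates that have to be rejected by eye, and text inside script or style elements is not filtered out by this sketch.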
#25
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 01/04/2016 23:24, Dr J R Stockton wrote:
In uk.d-i-y message id , Tue, 15 Mar 2016 23:24:28, Dr J R Stockton .uk.invalid posted: In uk.d-i-y message , Thu, 3 Mar 2016 13:09:05, John Rumm posted: I can zip and email you the FAQ pages if you want... they should be processable with your current setup. He did. They were. Interesting. Useful to me. More later. LINXCHEK found 246 instances of candidate acronyms, 149 distinct candidates. Most came from subject lines in upper-case, and could be rejected at first glance. Some were on the existing Acronyms List. Some were too well-known to need explanation, such as ISBN and PTFE. A few do IMHO justify an explanation, either in the List or in the pages in which they appear. LINXCHEK also found 15 anchors to be cited and missing, and 10 anchors not cited from within the FAQ. And it inadvertently found quite a number of over-munged "Mailto:"s. Doing this, it read 48 HTML files in about 40 seconds on an old desktop PC - I think that most pages made a Google call on-load, which would have slowed things down a bit, as would the Classic FM from ZA that the PC was also playing. These findings have been notified to John Rumm as Editor. Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. Pulling down a static copy with wget may get what you need. -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
#26
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 01/04/2016 23:24, Dr J R Stockton wrote:
Some were too well-known to need explanation, such as ISBN and PTFE. I wouldn't be surprised to find my wife doesn't know what PTFE is, and certainly won't know what it stands for. Tell her it's Teflon (I know, a trade name) and she'll understand. Andy -- Hope I get it right - polytetrafluoroethylene My spiel chucker thinks I must mean polythene. |
#27
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On Sun, 03 Apr 2016 21:48:44 +0100, Vir Campestris wrote:
On 01/04/2016 23:24, Dr J R Stockton wrote: Some were too well-known to need explanation, such as ISBN and PTFE. I wouldn't be surprised to find my wife doesn't know what PTFE is, and certainly won't know what it stands for. Tell her it's Teflon (I know, a trade name) and she'll understand. Hope I get it right - polytetrafluoroethylene My spiel chucker thinks I must mean polythene. Nah, polyethylene, surely? :-) -- Johnny B Good |
#28
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
If you add the word 'acronym' to the glossary page,
then it will show up in search: http://wiki.diyfaq.org.uk/index.php?...cronym&go =Go |
#29
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
ISBN and PTFE and IMHO
and all such TLAs should be in the glossary in my opinion, shouldn't assume they're all commonly known, we may have people from other languages viewing the wiki [g] |
#30
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On Wed, 06 Apr 2016 03:56:16 -0700, DICEGEORGE wrote:
ISBN and PTFE and IMHO and all such TLAs should be in the glossary in my opinion, shouldn't assume they're all commonly known, we may have people from other languages viewing the wiki [g] ISBN, PTFE and IMHO are all FLAs actually. :-) -- Johnny B Good |
#31
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 08/04/2016 05:08, Johnny B Good wrote:
On Wed, 06 Apr 2016 03:56:16 -0700, DICEGEORGE wrote: ISBN and PTFE and IMHO and all such TLAs should be in the glossary in my opinion, shouldn't assume they're all commonly known, we may have people from other languages viewing the wiki [g] ISBN, PTFE and IMHO are all FLAs actually. :-) ETLAs if you don't mind ;-) -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
#32
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 03/04/2016 21:48, Vir Campestris wrote:
On 01/04/2016 23:24, Dr J R Stockton wrote: Some were too well-known to need explanation, such as ISBN and PTFE. I wouldn't be surprised to find my wife doesn't know what PTFE is, and certainly won't know what it stands for. Tell her it's Teflon (I know, a trade name) and she'll understand. We used to call it "Plastic Tape For Engineers" |
#33
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message ,
Sat, 2 Apr 2016 00:58:23, John Rumm posted: On 01/04/2016 23:24, Dr J R Stockton wrote: Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. Pulling down a static copy with wget may get what you need. I'm boggled both in respect of how to install Wget for Windows and in respect of how to explain to it what the command line for requesting the whole of the UK DIY Wiki might be - though I have in mind a couple of simple sites to practice on. So, unless anyone can clarify these matters rather exactly, I'll move on to some other problem. I see one very minor difficulty in processing the Wiki - having written LINXCHEK for my own site, in which almost all pages are .HTM, I can predict which links are to HTML pages from an input box containing a whitespace-separated list of extensions. Obviously I cannot put an empty extension in the box as it is. That is easily solved, of course. But I would like to be able to predict, from a page containing a link, whether that link needs to be loaded and searched for further links. On my PCs, it could be a link to a 350 MB PDF of page images of Christopher Clavius' Opera Mathematica V - I do not want LINXCHEK to read that. BTW, on the plain FAQ, LINXCHEK unsurprisingly ran much faster after I removed the code which calls Google from every page. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
#34
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 10/04/2016 22:15, Dr J R Stockton wrote:
In uk.d-i-y message , Sat, 2 Apr 2016 00:58:23, John Rumm posted: On 01/04/2016 23:24, Dr J R Stockton wrote: Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. Pulling down a static copy with wget may get what you need. I'm boggled both in respect of how to install Wget for Windows and in respect of how to explain to it what the command line for requesting the whole of the UK DIY Wiki might be - though I have in mind a couple of simple sites to practice on. Install is easy enough. Go to: http://downloads.sourceforge.net/gnu....4-1-setup.exe Then run the installer. For ease of use stick it into a folder close to the root of your hard drive - say c:\wg By way of example, I did a recursive get to a maximum of 1 level on the category page for doors (only used that since it was not a huge category) Made a directory for wiki stuff, then executed wget: C:\wiki\wg\wget http://wiki.diyfaq.org.uk/index.php/Category:Doors -r -l 1 The URL is the page to start at (you could use the main home page if you wanted the lot). The -r recursion flag makes it walk down the tree of links it finds. And the level limit of 1 meant it did not then also repeat the search on the downloaded pages ad infinitum. That created a wiki.diyfaq.org.uk folder and then an index.php folder in that. Inside that were all the top level articles linked from the doors category - saved as straight HTML. Doing a wget --help will show all the options. There are a few handy ones: A -k to change the links in the downloaded files to local links is handy. A -np (no parent) switch will stop it following links to higher up the directory structure than your starting point. The -w to add a wait time between retrievals will stop you hammering the server too hard etc. 
I see one very minor difficulty in processing the Wiki - having written LINXCHEK for my own site, in which almost all pages are .HTM, I can predict which links are to HTML pages from an input box containing a whitespace-separated list of extensions. Obviously I cannot put an empty extension in the box as it is. That is easily solved, of course. The --html-extension switch will force it to save all HTML files with a .html suffix... But I would like to be able to predict, from a page containing a link, whether that link needs to be loaded and searched for further links. On my PCs, it could be a link to a 350 MB PDF of page images of Christopher Clavius' Opera Mathematica V - I do not want LINXCHEK to read that. The --spider switch will take it through the traversal without actually downloading anything, so you can see the links it follows. You can also exclude file types you don't want (say .pdf) BTW, on the plain FAQ, LINXCHEK unsurprisingly ran much faster after I removed the code which calls Google from every page. There is not much Google code on there apart from the search box... -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
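[Editorial sketch] The "list of extensions, including an empty one" question can be illustrated in Python as below. It is a guess at the general approach, not LINXCHEK's code: the function name looks_like_html and the default extension set are hypothetical, and example.invalid is a placeholder host.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def looks_like_html(url: str, html_exts=frozenset({"", ".htm", ".html"})) -> bool:
    """Guess from the URL alone whether a link is worth loading and searching.

    The empty string in html_exts stands for "no extension", which is what
    wiki page names such as Main_Page have.
    """
    path = urlsplit(url).path
    last_segment = path.rsplit("/", 1)[-1]
    ext = splitext(last_segment)[1].lower()
    return ext in html_exts


if __name__ == "__main__":
    for link in (
        "http://wiki.diyfaq.org.uk/index.php/Main_Page",
        "00INDEX.HTM",
        "http://example.invalid/docs/opera.pdf",  # placeholder URL
    ):
        print(link, looks_like_html(link))
```

A page title that happens to contain a dot would be misjudged by this rule, so it is only a first filter; the decision is made from the link text alone, which is the constraint stated in the post.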
#35
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message ,
Mon, 11 Apr 2016 00:43:16, John Rumm posted: On 10/04/2016 22:15, Dr J R Stockton wrote: In uk.d-i-y message , Sat, 2 Apr 2016 00:58:23, John Rumm posted: On 01/04/2016 23:24, Dr J R Stockton wrote: Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. Pulling down a static copy with wget may get what you need. I'm boggled both in respect of how to install Wget for Windows and in respect of how to explain to it what the command line for requesting the whole of the UK DIY Wiki might be - though I have in mind a couple of simple sites to practice on. http://downloads.sourceforge.net/gnu....4-1-setup.exe Then run the installer. For ease of use stick it into a folder close to the root of your hard drive - say c:\wg I have C:\UTYS\ on the Path, and put batch files there to call such things, preferring to put as little as possible in C:\. I got WGET --help working, and then WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors -r -l 1 . The pages obtained are readable individually, but the inter-page links do not work. I'll think and test more. BTW, on the plain FAQ, LINXCHEK unsurprisingly ran much faster after I removed the code which calls Google from every page. There is not much google code on there apart from the search box... Agreed; but it contains an Internet call to Google and uses the reply to make the box. That, done about 40-fold, is what takes up noticeable time. I added a link to HMG's "Building Regulations, Part P (Electrical Safety)" to http://wiki.diyfaq.org.uk/index.php/Electrical_regulations. -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
#36
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 11/04/2016 23:46, Dr J R Stockton wrote:
In uk.d-i-y message , Mon, 11 Apr 2016 00:43:16, John Rumm posted: On 10/04/2016 22:15, Dr J R Stockton wrote: In uk.d-i-y message , Sat, 2 Apr 2016 00:58:23, John Rumm posted: On 01/04/2016 23:24, Dr J R Stockton wrote: Scanning the UK DIY Wiki would be another matter; LINXCHEK does not know about reading remote pages. Pulling down a static copy with wget may get what you need. I'm boggled both in respect of how to install Wget for Windows and in respect of how to explain to it what the command line for requesting the whole of the UK DIY Wiki might be - though I have in mind a couple of simple sites to practice on. http://downloads.sourceforge.net/gnu....4-1-setup.exe Then run the installer. For ease of use stick it into a folder close to the root of your hard drive - say c:\wg I have C:\UTYS\ on the Path, and put batch files there to call such things, preferring to put as little as possible in C:\. I got WGET --help working, and then WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors -r -l 1 . The pages obtained are readable individually, but the inter-page links do not work. I'll think and test more. Try the -k switch to change the links to local ones... -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
#37
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
In uk.d-i-y message ,
Tue, 12 Apr 2016 08:19:29, John Rumm posted: On 11/04/2016 23:46, Dr J R Stockton wrote: I got WGET --help working, and then WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors -r -l 1 . The pages obtained are readable individually, but the inter-page links do not work. I'll think and test more. Try the -k switch to change the links to local ones... External life intervened; but there is a degree of progress; some minor improvements within LINXCHEK have been made; some candidate acronyms have been reported from Main_Page, such as USA & GNU. Links from Main_Page to other pages of the Wiki have been reported as such; I need to see why they did not get followed, but not tonight. I used WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors with arguments -r -l inf -k -np -w 20 but it seems to have attempted to fetch the whole site, including Cement_Mixing. I stopped it after 250 pages. Can you give a rough idea of how many linked pages there are in the Wiki? Does anyone know how a browser can reliably detect whether it is running in a case-sensitive filesystem like UNIX or a case-independent one like Windows? -- (c) John Stockton, Surrey, UK. Turnpike v6.05 MIME. Merlyn Web Site - FAQish topics, acronyms, & links. |
#38
Posted to uk.d-i-y
http://wiki.diyfaq.org.uk/index.php/Main_Page
On 14/04/2016 23:38, Dr J R Stockton wrote:
In uk.d-i-y message , Tue, 12 Apr 2016 08:19:29, John Rumm posted: On 11/04/2016 23:46, Dr J R Stockton wrote: I got WGET --help working, and then WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors -r -l 1 . The pages obtained are readable individually, but the inter-page links do not work. I'll think and test more. Try the -k switch to change the links to local ones... External life intervened; but there is a degree of progress; some minor improvements within LINXCHEK have been made; some candidate acronyms have been reported from Main_Page, such as USA & GNU. Links from Main_Page to other pages of the Wiki have been reported as such; I need to see why they did not get followed, but not tonight. I used WGET http://wiki.diyfaq.org.uk/index.php/Category:Doors with arguments -r -l inf -k -np -w 20 but it seems to have attempted to fetch the whole site, including Cement_Mixing. I stopped it after 250 pages. All the pages include links "up" the hierarchy - so for example there is a link to the index and the list of categories on the side bar of every page. So unless you avoid following those, you will in the end grab the whole site. Can you give a rough idea of how many linked pages there are in the Wiki? 591 actual content pages, but 3245 if you include all the talk pages and redirects etc. Does anyone know how a browser can reliably detect whether it is running in a case sensitive filesystem like UNIX or a case-independent one like Windows? If you retrieve the navigator.userAgent string, it will include the OS as well as the browser. For example on a win 8.1 machine you might get: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0 -- Cheers, John. /================================================== ===============\ | Internode Ltd - http://www.internode.co.uk | |-----------------------------------------------------------------| | John Rumm - john(at)internode(dot)co(dot)uk | \================================================= ================/ |
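[Editorial sketch] Why an unlimited recursion depth ends up fetching Cement_Mixing from a start at the doors category can be shown with a toy link graph in Python. This only models the depth limit (the role of wget's -l option); it does not call wget. The graph is invented for illustration: Door_hanging is a hypothetical page name, and the function name reachable is an assumption.

```python
from collections import deque


def reachable(links: dict, start: str, max_depth=None) -> set:
    """Breadth-first walk of a link graph, optionally limited in depth.

    max_depth=None means "no limit", like asking for infinite recursion.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links found on pages at the limit
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


# Toy graph: every page carries a sidebar link back to Main_Page.
TOY_LINKS = {
    "Category:Doors": ["Door_hanging", "Main_Page"],  # Door_hanging is made up
    "Door_hanging": ["Main_Page"],
    "Main_Page": ["Category:Doors", "Cement_Mixing"],
    "Cement_Mixing": ["Main_Page"],
}

if __name__ == "__main__":
    print(sorted(reachable(TOY_LINKS, "Category:Doors", max_depth=1)))
    print(sorted(reachable(TOY_LINKS, "Category:Doors")))
```

With a depth of 1 the walk stops at the pages linked directly from the category; with no limit, the sidebar link to Main_Page is enough to reach every other page, which matches the behaviour described above.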