Talk:Gedbot

From Rodovid Engine


Urgent? (for Baya)

I just wanted to create a new person in the engine with my program... but I fear I flushed the person "New", because now when I look at "add a person" in XML, there is nothing (before, there was <record><person>... and so on).

If I made a mistake, I apologize!

I think I have a problem when I send modified XML back via the wikipedia.py library.

I am going on holiday tomorrow morning, so I will leave the engine in peace for two weeks!

pfrappe 20:09, 28 July 2011 (EEST)

Don't be afraid ))) Maybe there were some zero events in the engine db, so you have just cleared the wrong events ))) Check the recent changes; there you can find all your actions. In any case, you can do anything with the engine db ))) --Baya 11:33, 29 July 2011 (EEST)

Tests?

Privit Baya, Salut Pierre.

You are talking about a very sophisticated solution, aren't you?

Maybe we could go step by step...

Why not start with a simple bot that works only with Latin characters? I think this was the first idea... I guess the 23,000-person Gedcom file that Pierre wants to upload to RD is:

  • without people whose names may be written in Cyrillic
  • limited to the 17th century for its oldest persons (this is an important limit in French genealogy: the end of first-hand sources)
  • without aristocrats, or with only a few

So the probability of duplicate records (between this Gedcom file and the current RD database) is very low.

Why not build that first bot, working in Latin script, test it with small files first and big files afterwards, and watch how it works?

Alain and Dominique (admins) have offered to test that possible future bot. (See this page: fr:Gedbot).

The idea is to pool knowledge and effort... Once we have that first bot, people will gather around the problem. People are waiting for solutions for merging Gedcoms online.

Maybe another genealogist-programmer will contribute, build a bot that searches for duplicate records, and then work on something that handles both Latin and Cyrillic, and then Hebrew, kanji, etc.

Don't you think so? --Christophe Tesson - talk. 19:50, 23 July 2011 (EEST)

  • I think the first step is a simple bot for working with RD. (It can also synchronize RD with local files and vice versa.) --Baya 12:48, 25 July 2011 (EEST)

Planning August 2011

Hello, Baya, Christophe, and others !

I totally agree with your opinions! For now I'm working on a bot for the Latin alphabet (but it's always interesting to discuss ideas for the future: it doesn't mean I'm really doing that!)

  • I began to learn Python last week (a fine and easy language!) and I'll be on holiday from July 29th to August 16th.
  • I studied the wikipedia.py library (thanks, Christophe). This library is useful but doesn't work correctly with Rodovid, because a Rodovid page has a <form> (as far as I understand, wikipedia.py works only with one <textarea> per page). So I'm writing a special adaptation that works with Rodovid (with the engine, in fact).
  • At this time, I can get the data of an existing person from Rodovid by refnum, and I can work on that data.
  • I'm trying to write it back correctly and create links between persons. For now it doesn't work and I don't know why (a problem with the data in the POST request, I think).
  • When I'm able to write data with my bot, I'll need help from Baya to check that everything is correct, and maybe to run tests to evaluate the load on the server. Certainly not before mid-August.
  • After that, at the end of August, I'll write the program that reads a Gedcom file and puts the data into the engine.
  • Then, the dialog with some Rodovid requests to prevent duplicates.
  • And finally, I'll have to create the dialog with the user who is importing the Gedcom file.
  • I always work in the engine with my own data (back to the 14th century; it should be correct, with no isolated persons and no null persons).
  • Then I'll try to create some duplicates in the engine and try again to detect them.

In my opinion, we must go slowly. But if it works, it could be great. I hope a first version could be available in September.

In the past, I made a 'sniffer' that was able to retrieve data from Geneanet (or Geneweb sites such as Roglo) into my own database. It helped me a lot and kept me from creating duplicates.

What about reusing this part of my program (I mean from an ethical point of view)? Has the Rodovid team had discussions about copying data from other sites? (If we note the origin, of course!)

Pfrappe 09:22, 26 July 2011 (EEST)

Hello Pfrappe,

1) To avoid parsing the "form", you can set "Edit record as" <XML|GEDCOM> in your preferences.

2) I don't know of any discussion at RD about copying records between sites (except the discussion about merging with WikiTree). As far as I know, genealogical data can't be licensed, so we can copy it from any site to any other. An IMPORTANT point is that we must define the REFNUMS correctly. In that case we can even use automatic crosslinking between records on different sites, which simplifies the search process for other people. So convenience and usefulness are what matter.

--Baya 10:55, 26 July 2011 (EEST)

Links with other genealogical databases

About inter-site cross-references: in my database, I store all the URLs of other pages referring to the same person. So I can automatically check whether some page has changed, or get parents or children. I can also avoid navigating through a site if I already have the pages.

  • Maybe we could have in Rodovid a special field <link_to_external_databases><link>url1</link><link>url2</link></link_to_external_databases> to store such links?

It is a bit different from the 'wikipages' field, because this new field should refer only to genealogical pages. That way we could, in the future, have a bot comparing Rodovid pages with other pages and automatically adding parents and children that are not yet in Rodovid. An idea...
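As a rough illustration of the proposed field, here is a minimal Python sketch that builds it with the standard library. The element names follow the suggestion above; the field does not exist in Rodovid, and the example.org URLs and the function name build_external_links are placeholders.

```python
import xml.etree.ElementTree as ET

def build_external_links(urls):
    """Build the proposed (hypothetical) <link_to_external_databases> element."""
    container = ET.Element("link_to_external_databases")
    seen = set()
    for url in urls:
        if url in seen:  # store each external page only once
            continue
        seen.add(url)
        ET.SubElement(container, "link").text = url
    return container

# Placeholder URLs, for illustration only.
links = build_external_links([
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/a",
])
xml_text = ET.tostring(links, encoding="unicode")
```

A bot could attach such an element to a person record and later walk the <link> children to compare the external pages with the Rodovid page.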

I think that the URL in another db is a real refnum, so you can just add a new refnum. --Baya 11:24, 27 July 2011 (EEST)

For Geneanet and Roglo, unfortunately, the id (id=123456) is not constant!! It is recalculated each time the databases are restructured.

I have already discussed this with the Geneanet and Roglo administrators, asking them to create a permanent id (like the refnum in Rodovid), but they all answered that it was too complicated: if we want a permanent link to a person in Geneweb databases, we must use the URL composed with p=Pierre;n=Frappe;oc=number of occurrences ...

(In fact, if somebody changes the first name or the last name, it becomes a new link, and the oc= changes too! So it's not a permanent link; it is just more constant than the Geneweb id.)

pfrappe 19:16, 27 July 2011 (EEST)
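To show why such a link is only as stable as the name, here is a small Python sketch that composes the p=...;n=...;oc=... part of a Geneweb URL. The function name is hypothetical, and always emitting oc (even when it is 0) is an assumption; the base address of the site is left out on purpose.

```python
from urllib.parse import quote

def geneweb_person_key(first_name, surname, occurrence=0):
    """Compose the 'p=...;n=...;oc=...' part of a Geneweb person URL.

    Assumption: the three parts are joined with ';' as in the example
    above, and non-ASCII characters are percent-encoded as UTF-8.
    """
    return ";".join([
        "p=" + quote(first_name),
        "n=" + quote(surname),
        "oc=%d" % occurrence,
    ])

key = geneweb_person_key("Pierre", "Frappe", 0)
# Correcting the spelling of the surname yields a different key,
# so any stored link to the old key goes stale.
renamed = geneweb_person_key("Pierre", "Frappé", 0)
```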

In this case it is possible to store the current URL together with the time when this URL was true. In the future we can create a robot that checks these URLs and either updates the RD record with the new one (if the new URL can be detected automatically), marks the URL as "old", or sends a message about the changed URL to the record creator. --Baya 10:51, 28 July 2011 (EEST)
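The idea of a URL stored with the time it was true can be sketched as a small Python data structure. This is only an illustration: the class and field names are hypothetical, the actual page check is left out (the caller passes in its result), and notifying the record creator is not modelled.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class ExternalLink:
    url: str
    verified_at: datetime    # last time the URL was known to be true
    status: str = "current"  # "current" or "old"

def recheck(link: ExternalLink, still_valid: bool, now: datetime,
            new_url: Optional[str] = None) -> ExternalLink:
    """Return the updated link after a robot has checked the URL."""
    if still_valid:
        return ExternalLink(link.url, now)
    if new_url is not None:
        # The new address could be detected automatically.
        return ExternalLink(new_url, now)
    # Nothing found: keep the old URL and its date, but mark it "old".
    return ExternalLink(link.url, link.verified_at, "old")

first_seen = datetime(2011, 7, 28, 10, 51)
link = ExternalLink("http://example.org/person?id=123456", first_seen)
later = datetime(2011, 9, 1, 9, 0)
stale = recheck(link, still_valid=False, now=later)
```

Keeping the original verification time on an "old" link preserves the information that the URL was valid at that date, which is the point of storing the time at all.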

Edit with XML Format or Gedcom format

The edit record preferences are absolutely great!! I have begun to create and update some records.

  • Is it possible to toggle the edit format only for the bot, without changing preferences (by passing a parameter)?
  • If I send a modified text (XML or Gedcom), does it replace the previous one or is it merged into it?
  • In Gedcom format, there is a 2 NOTE RDEVENTID @11264@ line, but nothing like that in XML. Is it important? I guess it's the event number? If I create an event, I just have to add it somewhere in the page.
  • In Gedcom, I guess all NOTE lines with RDSOMETHING are for Rodovid?
  • In XML, to create family links, I think I only have to create the <parents> field, not <children>?
  • In Gedcom, for the parents, I see FAMC @:fatherrefnum+motherrefnum@ in the engine but a family number in Rodovid. Right?
  • I think I'll use Gedcom format to create records (probably simpler), but XML format to modify them (more structured).

Thanks Pfrappe 11:43, 26 July 2011 (EEST)

Wow! Xmas in September? --Christophe Tesson - talk. 15:55, 26 July 2011 (EEST)

  1. The best way is to create a special account, and I will set bot rights for that account.
  2. The software must replace everything in the current localization and in the global db.
  3. Yes, this is the internal number of the event: xml:ev_id === gedcom:RDEVENTID. If you want to change the event, use this id; otherwise a new event will be created and the old one deleted (so old events in other localizations will no longer be connected to it, and will be changed or removed only after a manual update).
  4. Yes.
  5. RD links persons in a different way from GEDCOM. Gedcom links persons through their family; RD links persons directly. RD person pages do not contain any information about their families. Because Gedcom does not have a field for parents, I use the id of the parents' common family (FAMC @family_id@) or, if they do not have a common family, the ids of the parents (FAMC @:fatherrefnum+motherrefnum@). In XML there are no links to families. XML (like the form) contains direct links to parents and children. It is not required to update the children's records if a parent with <children> is added, and vice versa.
  6. This is your choice ))) There is an important requirement for correct processing of the FAMC and CHIL fields by the RD software: you must use real RD ids of persons in these fields. So if you want to use Gedcom for new records, it will be convenient to use "FAMC @:father_id+mother_id@" for children whose parents are already imported.
--Baya 12:06, 27 July 2011 (EEST)
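The two FAMC forms described in point 5 can be illustrated with a short Python sketch. It assumes that RD ids are plain decimal numbers and that both parents are known when the parent form is used; the function names are hypothetical and this is not the RD software's own parser.

```python
import re

# "@:father_id+mother_id@": direct reference to both parents
_PARENTS = re.compile(r"^@:(\d+)\+(\d+)@$")
# "@family_id@": reference to the parents' common family
_FAMILY = re.compile(r"^@(\d+)@$")

def format_famc(father_id=None, mother_id=None, family_id=None):
    """Return the pointer that follows the FAMC tag."""
    if family_id is not None:
        return "@%d@" % family_id
    if father_id is None or mother_id is None:
        raise ValueError("need a family id or both parent ids")
    return "@:%d+%d@" % (father_id, mother_id)

def parse_famc(pointer):
    """Split a FAMC pointer into parent ids or a family id."""
    match = _PARENTS.match(pointer)
    if match:
        return {"father": int(match.group(1)), "mother": int(match.group(2))}
    match = _FAMILY.match(pointer)
    if match:
        return {"family": int(match.group(1))}
    raise ValueError("unrecognized FAMC pointer: %r" % pointer)

line = "1 FAMC " + format_famc(father_id=12, mother_id=34)
parsed = parse_famc("@:12+34@")
```

With helpers like these, a Gedcom importer could first create the parents, record their real RD ids, and then emit the child's FAMC line in the parent form.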

Good news

Pierre Frappé said on his French user talk page that he's still working on Gedcom. So this project is still running! Please hold the line! --Christophe Tesson - talk. 14:47, 14 September 2011 (EEST)
