python-pptx

feature: Slide.duplicate()

Open AlexMooney opened this issue 10 years ago • 84 comments

In order to create a presentation by populating template slides with dynamic data
As a developer using python-pptx
I need the ability to clone a slide

API suggestion:

cloned_slide = prs.slides.clone_slide(original_slide)

The cloned slide would be appended to the end of the presentation and would be functionally equivalent to copying and pasting a slide using the PPT GUI.

AlexMooney avatar Nov 12 '14 20:11 AlexMooney

Hi Alex, can you describe your use case for us please?

scanny avatar Nov 12 '14 21:11 scanny

The goal is to automatically generate a couple dozen presentations when the data they present is updated. Every couple of weeks, we get updates and I want to save everyone from having to enter the new stuff in PPT.

I have made a template deck with a few placeholder type slides (there happens to be 3 styles of slide per presentation). My program reads text files to inject arbitrary data into the tables, text boxes, and charts on the placeholder slides. The next feature I need to figure out is that sometimes a deck will have 4 slides of type B instead of the usual 1. I'd like to e.g. duplicate the original slide B three times then inject data into each of those.

One of the design constraints is that the users of this program won't be able to maintain programmatically generated slides, so I'm modifying a template pptx file that they'll be able to make changes to as their needs evolve, rather than using things like add_slide, add_chart, and so forth from within Python. The only way I know of to do this now is to make the template have many redundant placeholder slides and have the end user manually delete the unused ones.

AlexMooney avatar Nov 12 '14 22:11 AlexMooney

Can you say a little more about the design constraint you mention? I'm not clear whether you're saying your users won't be able to modify code (understandable :), or whether you don't want them to be able to modify the slides after they're generated, or perhaps something else entirely.

If you can give an idea of the motivations behind that constraint I think that will make it much clearer.

scanny avatar Nov 21 '14 04:11 scanny

This would be very useful! I need the same option, plus one for inserting an empty slide, as I'm trying to automatically generate a pptx that takes values from a user report in a database and converts them into a presentation. I've made my own function for inserting/copying a slide, but I've been waiting for this option since python-pptx v0.3.2. Please add this feature!

ghost avatar Nov 21 '14 12:11 ghost

They won't be able to modify the Python code. They will definitely have to modify the slides by hand after they're populated with the data but before they're shown to managers. :)

The workaround I've since arrived upon is to have them manually make the pptx decks by hand and then my script goes back in and fills in all the data that it knows how to handle. Now, if they want to have a 4 page ppt with an extra chart on page 3, they make that in PowerPoint, but leave the data out, and my stuff fills it in. It would have been nicer to skip the manual portion, but my use case is mostly covered.

AlexMooney avatar Nov 21 '14 18:11 AlexMooney

Ah, ok, I think I see what you're trying to do now.

It sounds like you're essentially having the users maintain a "template" presentation which your code then uses as a base into which to insert the latest content.

It's an interesting approach. The main benefit as I see it being the end-users can use PowerPoint to tweak the template slides themselves.

We've had a bunch of folks address similar challenges by just building up the slide in its entirety using python-pptx code. This has the advantage that you don't have to get rid of empty slides and so forth, but it doesn't allow end-users to tweak the slide template. They'd have to come to you for a code change every time some formatting or ancillary copy needed to change.

I'll leave this one open as a possible future feature. It turns out to be trickier than one might expect to implement in the general case because there are quite a number of possible connections between a slide and other parts of the presentation. But we'll give it some more noodling :)

scanny avatar Nov 22 '14 00:11 scanny

I have the same use case. Looking forward to your API. :)

children1987 avatar Jan 14 '15 15:01 children1987

I'm interested in doing the exact same thing as @AlexMooney.

@scanny: you said that it was difficult to implement in the general case, as there may be complicated linkages between slides. The presentations I'm working with won't have those complicated connections - would you be able to give any hints as to the best way to implement a duplicate() function that works in a limited case of simple presentations? At the moment I'm trying to loop through shapes and add them, but that requires a lot of logic to deal with shape types, copying subobjects etc

robintw avatar May 13 '15 08:05 robintw

I am also interested in doing the exact same thing as @AlexMooney.

My use case is that the client wants to be able to supply any one from a structurally equivalent set of 2-slide templates, and have python code that will create new slides populated from dynamic test data. So we would like to duplicate from one of the template slides, and insert at the end of the deck. Upon saving the presentation, the 2 template slides would be stripped out.

Does anyone know if these features have already been implemented in python-pptx?

karlschiffmann avatar Oct 13 '15 20:10 karlschiffmann

I don't know of it being implemented in python-pptx, but I have a pretty-awful implementation that works for my use case - but I should warn you that it may well not work for some of your situations, and is likely to be very buggy!

The code is below, and should be reasonably self-explanatory - but I warn you, it may well fail (and definitely only works for duplicating slides within presentations - for copying between presentations you open up a whole other can of worms)

def _get_blank_slide_layout(pres):
    layout_items_count = [len(layout.placeholders) for layout in pres.slide_layouts]
    min_items = min(layout_items_count)
    blank_layout_id = layout_items_count.index(min_items)
    return pres.slide_layouts[blank_layout_id]

def duplicate_slide(pres, index):
    """Duplicate the slide with the given index in pres.

    Adds slide to the end of the presentation"""
    source = pres.slides[index]

    blank_slide_layout = _get_blank_slide_layout(pres)
    dest = pres.slides.add_slide(blank_slide_layout)

    for shp in source.shapes:
        el = shp.element
        newel = copy.deepcopy(el)
        dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

    for key, value in six.iteritems(source.rels):
        # Make sure we don't copy a notesSlide relation as that won't exist
        if not "notesSlide" in value.reltype:
            dest.rels.add_relationship(value.reltype, value._target, value.rId)

    return dest

robintw avatar Oct 13 '15 21:10 robintw

I very much appreciate that! Will try it out... trying to decide whether to use python-pptx or just via MSPPT.py. Thanks again.

karlschiffmann avatar Oct 13 '15 21:10 karlschiffmann

Duplicating slides within presentations would be a great feature. The above snippet works, but doesn't seem to copy the table cells correctly (it replicated the tables from the initial slide, but I can't add paragraphs to them). Maybe there is something else going on there? Not deep copying the text elements?

mtbdeano avatar Oct 16 '15 16:10 mtbdeano

Ah yes, I've never tried it with slides containing tables, so it probably doesn't work properly for those. I'm afraid I haven't got time to investigate the problem with tables, but if you do manage to fix it then let me know.

robintw avatar Oct 16 '15 16:10 robintw

Actually, that code works fine with tables! The error was on my side: I needed to make sure it was a deepcopy (I was using a weird library for that). All works great. You should add the duplicate slide method to the slide object, with the caveat that it only works within a presentation.

mtbdeano avatar Oct 16 '15 19:10 mtbdeano

@scanny: Would you be interested in this being added as a method to the slide object?

robintw avatar Oct 16 '15 19:10 robintw

Yes, thank you.

karlschiffmann avatar Oct 16 '15 20:10 karlschiffmann

@robintw: I would, of course :) Probably best if you start with an analysis document like the ones you find here so we can think through the required scope. There's not really a place for methods that only work for certain cases, so we'd need to work out what the scope of the general case would be and account for the various bits. If I recall correctly, the tricky bit on this one is to make sure relationships to external items like images and hyperlinks are properly taken care of.

Then of course you would need to provide the tests. I think that bit is the most rewarding, in the sense it makes you a better programmer, but seems to be beyond the abilities of most contributors.

Let me know if you're still keen and we can get started.

scanny avatar Oct 17 '15 12:10 scanny

Unfortunately I don't think I'll have the time to engage with this project to that extent. I'm significantly involved in a number of other open-source projects, while also working full-time - and I just can't commit to do this work properly.

I'm more than happy for anyone else who has time to take the code that I've posted in this issue and integrate it with python-pptx, or just use it themselves.

robintw avatar Oct 17 '15 12:10 robintw

Sorry, I am not in a position to work on this right now either...

karlschiffmann avatar Oct 26 '15 18:10 karlschiffmann

Thank you all for your fantastic work! I'd like to try to fix this. But I haven't contributed before, so I have no idea whether I can do it well. What's more, my native language is Chinese, so if I ask some silly questions in my poor English, could you forgive me? @robintw @scanny

children1987 avatar Nov 01 '15 15:11 children1987

@children1987: You're entirely welcome to give it a try :) We'll have to see what you can come up with to get an idea how far away you'd be from getting a commit.

You'd need to be able to explain what you're doing and also write the tests. Those are the harder parts, so most people just write the code and don't bother with those bits; but they are what makes the library robust, so we can't accept a pull request without them.

Your English seems good enough so far. I'm sure we can manage to fix up the grammar and so on if your analysis is sound.

scanny avatar Nov 02 '15 21:11 scanny

@robintw, thank you. With a slight modification, the code can copy a slide from a template into a new presentation, which is a great improvement for me. My project generates PowerPoint reports automatically; previously I made many template files for different requirements. Now I can consolidate identically formatted slides into one template, then iterate: copy a slide and substitute its content. Known bug: the background and some formatting are lost. It isn't critical for me, but I hope you can help.

    def _get_blank_slide_layout(pres):
        # Treat the layout with the fewest placeholders as the "blank" one
        layout_items_count = [len(layout.placeholders) for layout in pres.slide_layouts]
        min_items = min(layout_items_count)
        blank_layout_id = layout_items_count.index(min_items)
        return pres.slide_layouts[blank_layout_id]

    def copy_slide(pres, pres1, index):
        """Copy the slide at index in pres to the end of pres1."""
        source = pres.slides[index]

        blank_slide_layout = _get_blank_slide_layout(pres)
        dest = pres1.slides.add_slide(blank_slide_layout)

        for shp in source.shapes:
            el = shp.element
            newel = copy.deepcopy(el)
            dest.shapes._spTree.insert_element_before(newel, 'p:extLst')

        for key, value in six.iteritems(source.rels):
            # Make sure we don't copy a notesSlide relation as that won't exist
            if "notesSlide" not in value.reltype:
                dest.rels.add_relationship(value.reltype, value._target, value.rId)

        return dest

zhong2000 avatar Jan 19 '16 08:01 zhong2000

The code samples above for duplication of a slide use a variable or module called 'six' for function 'iteritems' which is not defined or clear here. Can you expand on your imports or refactor to use standard iteritems tools?

I want to copy a complex template slide 50 times, performing text substitution on each duplicate per some other data, then delete the template slide.

hariedo avatar May 29 '17 02:05 hariedo

@hariedo six is a module that provides some compatibility helpers to make code work on Python 2 and 3. The documentation for the six.iteritems method (available at https://pythonhosted.org/six/#six.iteritems) states:

"Returns an iterator over dictionary's items. This replaces dictionary.iteritems() on Python 2 and dictionary.items() on Python 3."

So you should be able to replace it with whichever of these applies to your Python version.

robintw avatar May 30 '17 19:05 robintw
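
For readers on Python 3 only, the substitution described above might look like the following. This is a minimal sketch: the `rels` dictionary is made-up stand-in data (plain strings, not real python-pptx relationship objects), and `iter_items` is a hypothetical helper name used only for illustration.

```python
# Hypothetical stand-in for a slide's relationships: rId -> relationship type.
rels = {
    "rId1": "http://example.invalid/relationships/image",
    "rId2": "http://example.invalid/relationships/notesSlide",
}


def iter_items(mapping):
    """Python 3 replacement for six.iteritems(mapping)."""
    return iter(mapping.items())


# Same filter as in the snippets above: drop any notesSlide relationship.
kept = {rId: reltype for rId, reltype in iter_items(rels)
        if "notesSlide" not in reltype}
```

On Python 3 the helper is not really needed; `for key, value in source.rels.items():` does the same job in the loop from the original snippet.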

@robintw, thanks for that. My example ppt files don't seem to have any data that would cause source.rels or dest.rels to exist. I get 99% of what I want with that chunk of code removed, but wonder what I will be missing if I leave it out.

hariedo avatar Jun 05 '17 06:06 hariedo

Images would be the most likely. Hyperlinks are also "relationship" objects, along with charts, smart art, and media (video, audio). There are a couple others that are more obscure.

scanny avatar Jun 05 '17 16:06 scanny
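
One way to see which of these relationships a particular slide carries is to read the relationships file stored inside the .pptx package, which is an ordinary zip archive. The sketch below uses only the standard library. It assumes the slide parts follow PowerPoint's usual `ppt/slides/slideN.xml` naming (the N in the file name is not guaranteed to match the slide's position in the deck), and `slide_relationships` is a hypothetical helper name, not part of python-pptx.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by Open Packaging Conventions relationship files.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def slide_relationships(pptx_file, slide_number=1):
    """Return (rId, short type, target) tuples for one slide part.

    `pptx_file` may be a path or a binary file-like object.
    """
    rels_path = "ppt/slides/_rels/slide%d.xml.rels" % slide_number
    with zipfile.ZipFile(pptx_file) as package:
        if rels_path not in package.namelist():
            return []  # no relationships file for this slide part
        root = ET.fromstring(package.read(rels_path))
    return [
        # Keep only the last URI segment of the type, e.g. "image" or "chart".
        (rel.get("Id"), rel.get("Type").rsplit("/", 1)[-1], rel.get("Target"))
        for rel in root.iter(RELS_NS + "Relationship")
    ]
```

A slide with only a `slideLayout` entry has nothing extra to carry over; entries such as `image`, `chart`, or `hyperlink` are the ones that would be lost if the relationship-copying loop were removed.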

@robintw @zhong2000 Thanks for the code. I am trying to apply it to charts and also getting the AttributeError: 'Slide' object has no attribute 'rels' when accessing source.rels.

Removing the code doesn't copy charts correctly.

I guess you are opening the presentation differently than pres = Presentation(file_name)?

cschrader avatar Jun 13 '17 08:06 cschrader

The internals changed a while back to extract a new SlidePart class from the Slide class. The .rels attribute moved over with the SlidePart object. The SlidePart object for a slide is accessed using the .part property on the Slide object.

So where you previously would have slide.rels you would now need slide.part.rels. I believe in the code above the change would be source.rels -> source.part.rels and dest.rels -> dest.part.rels.

scanny avatar Jun 13 '17 17:06 scanny
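
Applying that change to the duplicate function posted earlier in this thread gives roughly the following. Treat it as a sketch under assumptions rather than a supported API: it relies on private attributes (`_spTree`, `_target`) and on `rels.add_relationship(reltype, target, rId)` as used in the earlier snippets against python-pptx 0.6.x, so it may break on other versions, and it carries the same limits noted above (same presentation only, not tested against every relationship type).

```python
import copy


def duplicate_slide(pres, index):
    """Append a copy of the slide at `index` to the end of `pres`."""
    source = pres.slides[index]

    # Use the layout with the fewest placeholders as a stand-in for "blank".
    layout = min(pres.slide_layouts, key=lambda lo: len(lo.placeholders))
    dest = pres.slides.add_slide(layout)

    # Deep-copy each shape's XML element into the new slide's shape tree.
    for shape in source.shapes:
        new_el = copy.deepcopy(shape.element)
        dest.shapes._spTree.insert_element_before(new_el, "p:extLst")

    # Relationships now live on the SlidePart, reached through `.part`.
    for rel in source.part.rels.values():
        # Skip the notes-slide relationship; the new slide has no notes.
        if "notesSlide" not in rel.reltype:
            dest.part.rels.add_relationship(rel.reltype, rel._target, rel.rId)

    return dest
```

Typical use would be to open a deck with `Presentation(file_name)`, call `duplicate_slide(pres, 0)`, fill in the copy, and then save the presentation.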

Here is my use case: I generate a product report as a PowerPoint deck every week, and the number of products differs each week. I need a way to automatically create a new page by copying an existing one for the next product.

Situation: using a PowerPoint slide master saves you the time of searching the python-pptx API.

  1. None of the approaches in this thread work on Python 3.5 with python-pptx 0.6.6 (see also https://github.com/scanny/python-pptx/issues/238).
  2. Build the layout in the slide master in PowerPoint; then you can simply call pre.slides.add_slide(pre.slide_layouts[index]) to meet your needs.

I hope this saves people some time.


Update: no, the slide master approach didn't work. Charts placed in the master become part of the background and can't be edited. I'll try to find another way.

Quanjiang avatar Jul 11 '17 11:07 Quanjiang

Any ideas on why this duplicate method corrupts the output (Python 2.7, python-pptx 0.6.6)? I just have some tables and charts. I have tried duplicating both before and after filling them, but the file is corrupted either way.

I'd welcome any tool that shows what PowerPoint is repairing, so I can give more information.

Thanks for your work!

Bretto9 avatar Jul 14 '17 15:07 Bretto9